Title Of Invention 

Cloned Genome Of Infectious 
Hepatitis C Virus of Genotype 2a And Uses Thereof 

Field Of Invention 

The present invention relates to molecular 
approaches to the production of nucleic acid sequence 
which comprises the genome of infectious hepatitis C 
virus. In particular, the invention provides a nucleic 
acid sequence which comprises the genome of an 
infectious hepatitis C virus of genotype 2a. The 
invention therefore relates to the use of the nucleic 
acid sequence and polypeptides encoded by all or part of 
the sequence in the development of vaccines and 
diagnostic assays for HCV and in the development of 
screening assays for the identification of antiviral 
agents for HCV. 

Background Of Invention 

Hepatitis C virus (HCV) has a positive-sense 
single-strand RNA genome and is a member of the genus 
Hepacivirus within the Flaviviridae family of viruses 
(Rice, 1996). As for all positive-stranded RNA viruses, 
the genome of HCV functions as mRNA from which all viral 
proteins necessary for propagation are translated. 

The viral genome of HCV is approximately 9600 
nucleotides (nts) in length and consists of a highly 
conserved 5' untranslated region (UTR), a single long 
open reading frame (ORF) of approximately 9,000 nts and 
a complex 3' UTR. The 5' UTR contains an internal 
ribosomal entry site (Tsukiyama-Kohara et al., 1992; 
Honda et al., 1996). The 3' UTR consists of a short 
variable region, a polypyrimidine tract of variable 
length and, at the 3' end, a highly conserved region of 
approximately 100 nucleotides (Kolykhalov et al., 1996; 
Tanaka et al., 1995; Tanaka et al., 1996; Yamada et al., 
1996). The last 46 nucleotides of this conserved region 
were predicted to form a stable stem-loop structure 
thought to be critical for viral replication (Blight and 
Rice, 1997; Ito and Lai, 1997; Tsuchihara et al., 1997). 
The ORF encodes a large polypeptide precursor that is 
cleaved into at least 10 proteins by host and viral 
proteinases (Rice, 1996). The predicted envelope 
proteins contain several conserved N-linked 
glycosylation sites and cysteine residues (Okamoto et 
al., 1992a). The NS3 gene encodes a serine protease and 
an RNA helicase and the NS5B gene encodes an RNA- 
dependent RNA polymerase. 

A remarkable characteristic of HCV is its 
genetic heterogeneity, which is manifested throughout 
the genome (Bukh et al., 1995). The most heterogeneous 
regions of the genome are found in the envelope genes, 
in particular the hypervariable region 1 (HVR1) at the 
N-terminus of E2 (Hijikata et al., 1991; Weiner et al., 
1991). HCV circulates as a quasispecies of closely 
related genomes in an infected individual. Globally, 
six major HCV genotypes (genotypes 1-6) and multiple 
subtypes (a, b, c, etc.) have been identified (Bukh et 
al., 1993; Simmonds et al., 1993). 

The nucleotide and deduced amino acid 
sequences among isolates within a quasispecies generally 
differ by < 2%, whereas those between isolates of 
different genotypes vary by as much as 35%. Genotypes 
1, 2 and 3 are found worldwide and constitute more than 
90% of the HCV infections in North and South America, 
Europe, Russia, China, Japan and Australia (Forns and 
Bukh, 1998). Throughout these regions genotype 1 
accounts for the majority of HCV infections but 
genotypes 2 and 3 each account for 5-15%. 
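
The divergence figures quoted above are pairwise differences between aligned sequences. The following is a minimal sketch of such a calculation, assuming two sequences that have already been aligned to equal length. The function name `percent_difference` and the short example fragments are hypothetical and are not taken from any HCV isolate.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences differ.

    Assumes the sequences are already aligned and of equal length.
    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Hypothetical 20-nt fragments differing at one position (illustration only)
fragment_1 = "ACGTACGTACGTACGTACGT"
fragment_2 = "ACGTACGTACGAACGTACGT"
print(percent_difference(fragment_1, fragment_2))
```

Skipping gapped positions is one possible convention; other analyses may count gaps as differences, which would give larger values.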

At present, more than 80% of individuals 
infected with HCV become chronically infected and these 
chronically infected individuals have a relatively high 
risk of developing chronic hepatitis, liver cirrhosis 
and hepatocellular carcinoma (Hoofnagle, 1997). The 
only effective therapy for chronic hepatitis C, 
interferon (IFN), alone or in combination with 
ribavirin, induces a sustained response in less than 50% 
of treated patients (Davis et al., 1998; McHutchinson et 
al., 1998). Consequently, HCV is currently the most 
common cause of end stage liver failure and the reason 
for about 30% of liver transplants performed in the U.S. 
(Hoofnagle, 1997). In addition, a number of recent 
studies suggested that the severity of liver disease and 
the outcome of therapy may be genotype-dependent 
(reviewed in Bukh et al., 1997). In particular, these 
studies suggested that infection with HCV genotype 1b 
was associated with more severe liver disease (Brechot, 
1997) and a poorer response to IFN therapy (Fried and 
Hoofnagle, 1995). As a result of the inability to 
develop a universally effective therapy against HCV 
infection, it is estimated that there are still more 
than 25,000 new infections yearly in the U.S. (Alter, 
1997). Moreover, since there is no vaccine for HCV, HCV 
remains a serious public health problem. 

Despite the intense interest in the 
development of vaccines and therapies for HCV, progress 
has been hindered by the absence of a useful cell 
culture system and the lack of any small animal model 
for laboratory study. For example, while replication of 
HCV in several cell lines has been reported, such 
observations have turned out not to be highly 
reproducible. In addition, the chimpanzee is the only 
animal model, other than man, for this disease. 
Consequently, HCV has been studied only by using 
clinical materials obtained from patients or 
experimentally infected chimpanzees, an animal model 
whose availability is very limited. 

However, several researchers have recently 
reported the construction of infectious cDNA clones of 
HCV, the identification of which would permit a more 
effective search for susceptible cell lines and 
facilitate molecular analysis of the viral genes and 
their function. For example, Yoo et al. (1995) and Dash 
et al. (1997) reported that RNA transcripts from 
cDNA clones of HCV-1 (genotype 1a) and HCV-N (genotype 
1b), respectively, resulted in viral replication after 
transfection into human hepatoma cell lines. 
Unfortunately, the viability of these clones was not 
tested in vivo and concerns were raised about the 
infectivity of these cDNA clones in vitro (Fausto, 
1997). In addition, neither clone contained the 
terminal 98 conserved nucleotides at the very 3' end of 
the UTR. 

Kolykhalov et al. (1997) and Yanagi et al. 
(1997, 1998) reported the derivation from HCV strains 
H77 (genotype 1a) and HC-J4 (genotype 1b) of cDNA clones 
of HCV that are infectious for chimpanzees. However, 
while these infectious clones will aid in studying HCV 
replication and pathogenesis and will provide an 
important tool for development of in vitro replication 
and propagation systems, it is important to have 
infectious clones of more than one genotype, given the 
extensive genetic heterogeneity of HCV and the potential 
impact of such heterogeneity on the development of 
effective therapies and vaccines for HCV. 

In addition, synthetic chimeric viruses can be 
used to map the functional regions of viruses with 
different phenotypes. In flaviviruses and pestiviruses, 
infectious chimeric viruses have been successfully 
engineered to express different functional units of 
related viruses (Bray and Lai, 1991; Pletnev et al., 
1992, 1998; Vassilev et al., 1997) and in some cases it 
has been possible to make chimeras between non-related 
or distantly related viruses. For instance, the IRES 
element of poliovirus or bovine viral diarrhea virus has 
been replaced with IRES sequences from HCV (Frolov et 
al., 1998; Lu and Wimmer, 1996; Zhao et al., 1999). 
Recently, the construction of an infectious chimera of 
two closely related HCV subtypes has been reported. The 
chimera contained the complete ORF of a genotype 1b 
strain but had the 5' and 3' termini of a genotype 1a 
strain (Yanagi et al., 1998). 

It is important to determine whether chimeras 
constructed from more divergent HCV strains are 
infectious because such chimeras could be used to define 
the functions of viral units and to dissect the immune 
response. 

Summary Of The Invention 

The present invention relates to nucleic acid 
sequence which comprises the genome of infectious 
hepatitis C virus and in particular, nucleic acid 
sequence which comprises the genome of infectious 
hepatitis C virus of genotype 2a. It is therefore an 
object of the invention to provide nucleic acid sequence 
which encodes infectious hepatitis C virus. Such 
nucleic acid sequence is referred to throughout the 
application as "infectious nucleic acid sequence". 

For the purposes of this application, nucleic 
acid sequence refers to RNA, DNA, cDNA or any variant 
thereof capable of directing host organism synthesis of 
a hepatitis C virus polypeptide. It is understood that 
nucleic acid sequence encompasses nucleic acid 
sequences, which due to degeneracy, encode the same 
polypeptide sequence as the nucleic acid sequences 
described herein. 
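
The degeneracy referred to above can be illustrated with a short sketch. It assumes the standard genetic code and uses two hypothetical nine-nucleotide sequences that are not drawn from SEQ ID NO:1. The names `CODON_TABLE` and `translate` are introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> str:
    """Translate a DNA or RNA coding sequence into one-letter amino acids."""
    dna = sequence.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at four nucleotide positions
seq_a = "ATGGCTTCT"
seq_b = "ATGGCCAGC"
print(translate(seq_a), translate(seq_b))  # both encode Met-Ala-Ser
```

Although the two sequences differ at the nucleotide level, they encode the same polypeptide and would therefore both fall within the definition given above.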

The invention also relates to the use of the 
infectious nucleic acid sequences to produce chimeric 
genomes consisting of portions of the open reading 
frames of nucleic acid sequences of other genotypes 
(including, but not limited to, genotypes 1, 2, 3, 4, 5 
and 6) and subtypes (including, but not limited to, 
subtypes 1a, 1b, 2a, 2b, 2c, 3a, 4a-4f, 5a and 6a) of 
HCV. For example, infectious nucleic acid sequence of 
the 2a strain HC-J6 described herein can be used to 
produce chimeras with sequences from the genomes of 
other strains of HCV from different genotypes or 
subtypes. Nucleic acid sequences which comprise 
sequences from two or more HCV genotypes or subtypes are 
designated "chimeric nucleic acid sequences". 

The invention further relates to mutations of 
the infectious nucleic acid sequence of the invention 
where mutation includes, but is not limited to, point 
mutations, deletions and insertions. In one embodiment, 
a gene or fragment thereof can be deleted to determine 
the effect of the deleted gene or genes on the 
properties of the encoded virus such as its virulence 
and its ability to replicate. In an alternative 
embodiment, a mutation may be introduced into the 
infectious nucleic acid sequences to examine the effect 
of the mutation on the properties of the virus. 

The invention also relates to the introduction 
of mutations or deletions into the infectious nucleic 
acid sequence in order to produce an attenuated 
hepatitis C virus suitable for vaccine development. 

The invention further relates to the use of 
the infectious nucleic acid sequence to produce 
attenuated viruses via passage in vitro or in vivo of 
the viruses produced by transfection of a host cell with 
the infectious nucleic acid sequence. 

The present invention also relates to the use 
of the nucleic acid sequence of the invention or 
fragments thereof in the production of polypeptides 
where "nucleic acid sequence of the invention" refers to 
infectious nucleic acid sequence, mutations of 
infectious nucleic acid sequence, chimeric nucleic acid 
sequence and sequences which comprise the genome of 
attenuated viruses produced from the infectious nucleic 
acid sequence of the invention. In one embodiment, said 
polypeptide or polypeptides are fully or partially 
purified from hepatitis C virus produced by cells 
transfected with nucleic acid sequence of the invention. 
In another embodiment, the polypeptide or polypeptides 
are produced recombinantly from a fragment of the 
nucleic acid sequences of the invention. In yet another 
embodiment, the polypeptides are chemically synthesized. 

The polypeptides of the invention, especially 
structural polypeptides, can serve as immunogens in the 
development of vaccines or as antigens in the 
development of diagnostic assays for detecting the 
presence of HCV in biological samples. 

The invention therefore also relates to 
vaccines for use in immunizing mammals especially humans 
against hepatitis C. In one embodiment, the vaccine 
comprises one or more polypeptides made from the nucleic 
acid sequence of the invention or fragment thereof. In 
a second embodiment, the vaccine comprises a hepatitis C 
virus produced by transfection of host cells with the 
nucleic acid sequences of the invention. 

The present invention therefore relates to 
methods for preventing hepatitis C in a mammal. In one 
embodiment the method comprises administering to a 
mammal a polypeptide or polypeptides encoded by the 
nucleic acid sequence of the invention in an amount 
effective to induce protective immunity to hepatitis C. 
In another embodiment, the method of prevention 
comprises administering to a mammal a hepatitis C virus 
of the invention in an amount effective to induce 
protective immunity against hepatitis C. 

In yet another embodiment, the method of 
protection comprises administering to a mammal the 
nucleic acid sequence of the invention or a fragment 
thereof in an amount effective to induce protective 
immunity against hepatitis C. 

The invention also relates to hepatitis C 
viruses produced by host cells transfected with the 
nucleic acid sequence of the present invention. 

The invention therefore also provides 
pharmaceutical compositions comprising the nucleic acid 
sequence of the invention and/or the encoded hepatitis C 
viruses. The invention further provides pharmaceutical 
compositions comprising polypeptides encoded by the 
nucleic acid sequence of the invention or fragments 
thereof. The pharmaceutical compositions of the 
invention may be used prophylactically or 
therapeutically. 

The invention also relates to antibodies to 
the hepatitis C virus of the invention or their encoded 
polypeptides and to pharmaceutical compositions 
comprising these antibodies. 

The invention also relates to the use of the 
nucleic acid sequences of the invention to identify cell 
lines capable of supporting the replication of HCV in 
vitro. 

The invention further relates to the use of 
the nucleic acid sequences of the invention or their 
encoded viral enzymes (e.g., NS3 serine protease, NS3 
helicase, NS5B RNA polymerase) to develop screening 
assays to identify antiviral agents for HCV. 

Brief Description Of Figures 

Figure 1 shows the amplification and cloning 
of hepatitis C virus genotype 2a (strain HC-J6CH). The 
nucleotide positions correspond to the sequence of 
pJ6CF, a full-length cDNA clone of hepatitis C virus, 
genotype 2a, strain HC-J6CH. Products from polymerase 
chain reaction are also shown. The names of the clones 
obtained from these products are indicated (the number of 
clones sequenced is shown in parentheses). The 
composition of the full-length cDNA clone is shown at 
the bottom. The restriction enzymes used for cloning 
are indicated. An XbaI site in HC-J6CH was eliminated by 
a silent substitution at position 5494. 

Figure 2 shows tree analysis of clones 
amplified from an infectious acute phase plasma pool 
generated in a chimpanzee inoculated with human plasma 
containing strain HC-J6 (Okamoto et al., 1991) as well 
as a tree of the predicted polyprotein sequence of 
HC-J6CH and the infectious HC-J6CH cDNA clone (pJ6CF). 
The nucleotide positions with deletions or insertions 
were stripped in the analysis of the clones. Multiple 
sequence alignments and tree analyses were performed 
with GeneWorks (Oxford Molecular Group) (Bukh et al., 
1995). Genotype designations are indicated. Other 
sequences included in the analysis are HC-J8 (Okamoto et 
al., 1992), genotype 1a infectious clone BEBE1 (Nakao et 
al., 1996), H77C (Yanagi et al., 1997); genotype 1b 
infectious clone J4L6S (Yanagi et al., 1998). The scale 
in each tree indicates the calculated genetic distance. 

Figure 3 shows the alignment of the 
hypervariable region 1 sequences from 8 J6S clones of 
strain HC-J6CH. HC-J6CH represents the consensus amino 
acid sequence of the infectious plasma pool from an 
experimentally infected chimpanzee. HC-J6 is the 
published amino acid sequence of the original inoculum 
(Okamoto et al., 1991). 

Figure 4 shows the construction of four 
intertypic chimeric cDNA clones. White boxes are 
sequences derived from genotype 2a clone pJ6CF, and 
black boxes are sequences derived from genotype 1a clone 
pCV-H77C (Yanagi et al., 1997). An NdeI site (mutation 
at position 9158 of pCV-H77C) was eliminated and an 
artificial NdeI site (mutation at position 2765 of 
pCV-H77C) was created by site-directed mutagenesis; 
silent mutations are underlined. 

Figures 5A and 5B show the alignment of the 
nucleotide sequences of the 5' (Fig. 5A) and 3' UTRs 
(Fig. 5B) and the amino acid sequences of E2/p7/NS2 
junctions (Fig. 5B) in the intertypic 1a, 2a chimeric 
cDNA clones. In the 5' UTR alignment, the first 39 nts 
of core believed to be important for the IRES function 
were included (Lemon and Honda, 1997). Top line: the 
sequence of the infectious genotype 1a clone pCV-H77C 
(Yanagi et al., 1997). Bottom line: the sequence of the 
infectious genotype 2a clone pJ6CF. Dot: identity with 
the sequence of H77C. Capital letter: different from the 
sequence of H77C. Dash: deletion. Bold face: initiation 
or stop codon of the ORF. Underlined: AgeI cleavage 
site. Arrow: putative sites in the HCV polyprotein 
cleaved by host signal peptidases. Numbering 
corresponds to the sequence of pCV-H77C. 

Figures 6A-6F show the nucleotide sequence of 
the infectious hepatitis C virus clone of genotype 1a 
strain H77C and Figures 6G-6H show the amino acid 
sequence encoded by the clone. 

Figures 7A-7F show the nucleotide sequence of 
the infectious hepatitis C virus clone of genotype 1b 
strain HC-J4 and Figures 7G-7H show the amino acid 
sequence encoded by the clone. 

DESCRIPTION OF THE INVENTION 

The present invention relates to nucleic acid 
sequence which comprises the genome of an infectious 
hepatitis C virus. More specifically, the invention 
relates to nucleic acid sequence which encodes 
infectious hepatitis C virus of strain HC-J6CH, genotype 
2a. The infectious nucleic acid sequence of the 
invention is shown in SEQ ID NO:1 and is contained in a 
plasmid construct deposited with the American Type 
Culture Collection (ATCC) on May 28, 1999 and having 
ATCC accession number PTA-153. 

The invention also relates to "chimeric 
nucleic acid sequences" where the chimeric nucleic acid 
sequences consist of open-reading frame sequences and/or 
5' and/or 3' untranslated sequences taken from nucleic 
acid sequences of hepatitis C viruses of different 
genotypes or subtypes. 

In one embodiment, the chimeric nucleic acid 
sequence consists of sequence from the genome of 
infectious HCV of genotype 2a which encodes structural 
polypeptides and sequence from the genome of a HCV of a 
different genotype or subtype which encodes 
nonstructural polypeptides. 
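
The composition of such a chimeric open reading frame can be sketched in silico as follows. This is an illustration only and does not represent the cloning procedure. It makes the simplifying assumption that both open reading frames share a common coordinate system, so that a single junction offset applies to both. The function `make_chimera`, the junction value and the toy sequences are hypothetical.

```python
def make_chimera(structural_donor: str, nonstructural_donor: str, junction: int) -> str:
    """Return an ORF whose nucleotides before `junction` come from one donor
    and whose nucleotides from `junction` onward come from the other.

    `junction` is a 0-based nucleotide offset into both ORFs. It must be
    a multiple of 3 so that the reading frame is preserved.
    """
    if junction % 3 != 0:
        raise ValueError("junction must fall on a codon boundary")
    if not 0 < junction < min(len(structural_donor), len(nonstructural_donor)):
        raise ValueError("junction must lie inside both sequences")
    return structural_donor[:junction] + nonstructural_donor[junction:]


# Toy ORFs standing in for two genotypes (illustration only)
orf_2a = "AAAAAACCCCCC"
orf_1a = "GGGGGGTTTTTT"

print(make_chimera(orf_2a, orf_1a, 6))  # structural part from orf_2a
print(make_chimera(orf_1a, orf_2a, 6))  # the reciprocal construct
```

In practice the genomes of different genotypes differ in length, so the junction would have to be located separately in each sequence, for example from an alignment.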

Alternatively, the nonstructural region of 
infectious HCV of genotype 2a and structural region of a 
HCV of a different genotype or subtype may be combined. 
This will result in a chimeric nucleic acid sequence 
consisting of sequence from the genome of infectious HCV 
of genotype 2a which encodes nonstructural polypeptides 
and sequence from the genome of a HCV of another 
genotype or subtype which encodes structural 
polypeptides. 

Preferably, the nucleic acid sequence from the 
genome of the infectious HCV clone of genotype 1a 
(deposited with the ATCC on June 2, 1999; Figures 6A-6F), 
or the nucleic acid sequence from the genome of the 
infectious HCV clone of genotype 1b (ATCC accession 
number 209596; Figures 7A-7F) is used to construct the 
chimeric nucleic acid sequence with the HCV of genotype 
2a of the invention. 

It is believed that the construction of such 
chimeric nucleic acid sequences will be of importance in 
studying the growth and virulence properties of 
hepatitis C virus and in the production of candidate 
hepatitis C virus vaccines suitable to confer protection 
against multiple genotypes of HCV. For example, one 
might produce a "multivalent" vaccine by putting 
epitopes from several genotypes or subtypes into one 
clone. Alternatively one might replace just a single 
gene from an infectious sequence with the corresponding 
gene from the genomic sequence of a strain from another 
genotype or subtype or create a chimeric gene which 
contains portions of a gene from two genotypes or 
subtypes. Examples of genes which could be replaced or 
which could be made chimeric, include, but are not 
limited to, the E1, E2 and NS4 genes. 

The invention further relates to mutations of 
the infectious nucleic acid sequences where "mutations" 
include, but are not limited to, point mutations, 
deletions and insertions. Of course, one of ordinary 
skill in the art would recognize that the size of the 
insertions would be limited by the ability of the 
resultant nucleic acid sequence to be properly packaged 
within the virion. Such mutations could be produced by 
techniques known to those of skill in the art such as 
site-directed mutagenesis, fusion PCR, and restriction 
digestion followed by religation. 

In one embodiment, mutagenesis might be 
undertaken to determine sequences that are important for 
viral properties such as replication or virulence. For 
example, one may introduce a mutation into the 
infectious nucleic acid sequence which eliminates the 
cleavage site between the NS4A and NS4B polypeptides to 
examine the effects on viral replication and processing 
of the polypeptide. 

Alternatively, one may delete all or part of a 
gene or of the 5' or 3' nontranslated region contained in 
an infectious nucleic acid sequence and then transfect a 
host cell (animal or cell culture) with the mutated 
sequence and measure viral replication in the host by 
methods known in the art such as RT-PCR. Preferred 
genes include, but are not limited to, the p7, NS4B and 
NS5A genes. Of course, those of ordinary skill in the 
art will understand that deletion of part of a gene, 
preferably the central portion of the gene, may be 
preferable to deletion of the entire gene in order to 
conserve the cleavage site boundaries which exist 
between proteins in the HCV polyprotein and which are 
necessary for proper processing of the polyprotein. 
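
The partial deletion described above can be sketched as an in-frame removal of the central portion of a gene that leaves a number of codons intact at each end. This is a sketch under stated assumptions: the function `delete_central_portion`, the number of flanking codons retained and the toy gene are hypothetical, and how many flanking codons are needed to preserve a given cleavage site is not specified here.

```python
def delete_central_portion(gene: str, keep_codons: int) -> str:
    """Delete the middle of a gene in frame.

    Keeps `keep_codons` codons at the 5' end and `keep_codons` codons at
    the 3' end, so that the boundaries with neighbouring genes and the
    downstream reading frame are left unchanged.
    """
    if len(gene) % 3 != 0:
        raise ValueError("gene length must be a multiple of 3")
    total_codons = len(gene) // 3
    if keep_codons < 1 or 2 * keep_codons >= total_codons:
        raise ValueError("keep_codons must leave a central portion to delete")
    keep_nt = 3 * keep_codons
    return gene[:keep_nt] + gene[-keep_nt:]


# Toy 7-codon gene (illustration only): keep 2 codons at each end
toy_gene = "ATGGCTTCTAGAGGTCCTGAA"
shortened = delete_central_portion(toy_gene, 2)
print(shortened, len(shortened) % 3 == 0)
```

Because whole codons are removed, the length of the shortened gene remains a multiple of three and sequences downstream of it stay in frame.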

In the alternative, if the transfection is 
into a host animal such as a chimpanzee, one can monitor 
the virulence phenotype of the virus produced by 
transfection of the mutated infectious nucleic acid 
sequence by methods known in the art such as measurement 
of liver enzyme levels (alanine aminotransferase (ALT) 
or isocitrate dehydrogenase (ICD)) or by histopathology 
of liver biopsies. Thus, mutations of the infectious 
nucleic acid sequences may be useful in the production 
of attenuated HCV strains suitable for vaccine use. 

The invention also relates to the use of the 
infectious nucleic acid sequence of the present 
invention to produce attenuated viral strains via 
passage in vitro or in vivo of the virus produced by 
transfection with the infectious nucleic acid sequence. 

The present invention therefore relates to the 
use of the nucleic acid sequence of the invention to 
identify cell lines capable of supporting the 
replication of HCV. 

In particular, it is contemplated that the 
mutations of the infectious nucleic acid sequence of the 
invention and the production of chimeric sequences as 
discussed above may be useful in identifying sequences 
critical for cell culture adaptation of HCV and hence, 
may be useful in identifying cell lines capable of 
supporting HCV replication. 

Transfection of tissue culture cells with the 
nucleic acid sequences of the invention may be done by 
methods of transfection known in the art such as 
electroporation, precipitation with DEAE-Dextran or 
calcium phosphate or liposomes. 

In one such embodiment, the method comprises 
the growing of animal cells, especially human cells, in 
vitro and transfecting the cells with the nucleic acid 
of the invention, then determining if the cells show 
indicia of HCV infection. Such indicia include the 
detection of viral antigens in the cell, for example, by 
immunofluorescence procedures well known in the art; the 
detection of viral polypeptides by Western blotting 
using antibodies specific therefor; and the detection of 
newly transcribed viral RNA within the cells via methods 
such as RT-PCR. The presence of live, infectious virus 
particles following such tests may also be shown by 
injection of cell culture medium or cell lysates into 
healthy, susceptible animals, with subsequent exhibition 
of the signs and symptoms of HCV infection. 

Suitable cells or cell lines for culturing HCV 
include, but are not limited to, lymphocyte and 
hepatocyte cell lines known in the art. 

Alternatively, primary hepatocytes can be 
cultured, and then infected with HCV; or, the hepatocyte 
cultures could be derived from the livers of infected 
chimpanzees. In addition, various immortalization 
methods known to those of ordinary skill in the art can 
be used to obtain cell lines derived from hepatocyte 
cultures. For example, primary hepatocyte cultures may 
be fused to a variety of cells to maintain stability. 

The present invention further relates to the 
in vitro and in vivo production of hepatitis C viruses 
from the nucleic acid sequences of the invention. 

In one embodiment, the sequences of the 
invention can be inserted into an expression vector that 
functions in eukaryotic cells. Eukaryotic expression 
vectors suitable for producing high efficiency gene 
transfer in vivo are well known to those of ordinary 
skill in the art and include, but are not limited to, 
plasmids, vaccinia viruses, retroviruses, adenoviruses 
and adeno-associated viruses. 

In another embodiment, the sequences contained 
in the recombinant expression vector can be transcribed 
in vitro by methods known to those of ordinary skill in 
the art in order to produce RNA transcripts which encode 
the hepatitis C viruses of the invention. The hepatitis 
C viruses of the invention may then be produced by 
transfecting cells by methods known to those of ordinary 
skill in the art with either the in vitro transcription 
mixture containing the RNA transcripts or with the 
recombinant expression vectors containing the nucleic 
acid sequences described herein. 

The hepatitis C viruses produced from the 
sequences of the invention may be purified or partially 
purified from the transfected cells by methods known to 
those of ordinary skill in the art. In a preferred 
embodiment, the viruses are partially purified prior to 
their use as immunogens in the pharmaceutical 
compositions and vaccines of the present invention. 

The present invention therefore relates to the 
use of the hepatitis C viruses produced from the nucleic 
acid sequences of the invention as immunogens in live or 
killed (e.g., formalin inactivated) vaccines to prevent 
hepatitis C in a mammal. 

In an alternative embodiment, the immunogen of 
the present invention may be an infectious nucleic acid 
sequence, a chimeric nucleic acid sequence, or a mutated 
infectious nucleic acid sequence which encodes a 
hepatitis C virus. Where the sequence is a cDNA 
sequence, the cDNAs and their RNA transcripts may be 
used to transfect a mammal by direct injection into the 
liver tissue of the mammal as described in the Examples. 

Alternatively, direct gene transfer may be 
accomplished via administration of a eukaryotic 
expression vector containing a nucleic acid sequence of 
the invention. 

In yet another embodiment, the immunogen may 
be a polypeptide encoded by the nucleic acid sequences 
of the invention. The present invention therefore also 
relates to polypeptides produced from the nucleic acid 
sequences of the invention or fragments thereof. In one 
embodiment, polypeptides of the present invention can be 
recombinantly produced by synthesis from the nucleic 
acid sequences of the invention or isolated fragments 
thereof, and purified, or partially purified, from 
transfected cells using methods already known in the 
art. In an alternative embodiment, the polypeptides may 
be purified or partially purified from viral particles 
produced via transfection of a host cell with the 
nucleic acid sequences of the invention. Such 
polypeptides might, for example, include either capsid 
or envelope polypeptides prepared from the sequences of 
the present invention. 

When used as immunogens, the nucleic acid 
sequences of the invention, or the polypeptides or 
viruses produced therefrom, are preferably partially 
purified prior to use as immunogens in pharmaceutical 
compositions and vaccines of the present invention. 

When used as a vaccine, the sequences and the 
polypeptide and virus products thereof, can be 
administered alone or in a suitable diluent, including, 
but not limited to, water, saline, or some type of 
buffered medium. The vaccine according to the present 
invention may be administered to an animal, especially a 
mammal, and most especially a human, by a variety of 
routes, including, but not limited to, intradermally, 
intramuscularly, subcutaneously, or in any combination 
thereof. 

Suitable amounts of material to administer for 
prophylactic and therapeutic purposes will vary 
depending on the route selected and the immunogen 
(nucleic acid, virus, polypeptide) administered. One 
skilled in the art will appreciate that the amounts to 
be administered for any particular treatment protocol 
can be readily determined without undue experimentation. 
The vaccines of the present invention may be 
administered once or periodically until a suitable titer 
of anti-HCV antibodies appears in the blood. For an 
immunogen consisting of a nucleic acid sequence, a 
suitable amount of nucleic acid sequence to be used for 
prophylactic purposes might be expected to fall in the 
range of from about 100 µg to about 5 mg and most 
preferably in the range of from about 500 µg to about 
2 mg. For a polypeptide, a suitable amount to use for 
prophylactic purposes is preferably 100 ng to 100 µg and 
for a virus 10^2 to 10^6 infectious doses. Such 
administration will, of course, occur prior to any sign 
of HCV infection. 

A vaccine of the present invention may be 
employed in such forms as capsules, liquid solutions, 
suspensions or elixirs for oral administration, or 
sterile liquid forms such as solutions or suspensions. 
An inert carrier is preferably used, such as saline or 
phosphate-buffered saline, or any such carrier in which 
the HCV of the present invention can be suitably 
suspended. The vaccines may be in the form of single 
dose preparations or in multi-dose flasks which can be 
utilized for mass-vaccination programs of both animals 
and humans. For purposes of using the vaccines of the 
present invention reference is made to Remington's 
Pharmaceutical Sciences, Mack Publishing Co., Easton,
Pa., Osol (Ed.) (1980); and New Trends and Developments
in Vaccines, Voller et al. (Eds.), University Park
Press, Baltimore, Md. (1978), both of which provide much
useful information for preparing and using vaccines. Of
course, the polypeptides of the present invention, when 
used as vaccines, can include, as part of the 
composition or emulsion, a suitable adjuvant, such as 
alum (or aluminum hydroxide) when humans are to be
vaccinated, to further stimulate production of
antibodies by immune cells. When nucleic acids, viruses
or polypeptides are used for vaccination purposes, other
specific adjuvants such as CpG motifs (Krieg, A.K. et
al. (1995) and (1996)), may prove useful.

When the nucleic acids, viruses and 
polypeptides of the present invention are used as 
vaccines or inocula, they will normally exist as 
physically discrete units suitable as a unitary dosage
for animals, especially mammals, and most especially
humans, wherein each unit will contain a predetermined
quantity of active material calculated to produce the
desired immunogenic effect in association with the
required diluent. The dose of said vaccine or inoculum
according to the present invention is administered at
least once. In order to increase the antibody level, a
second or booster dose may be administered at some time
after the initial dose. The need for, and timing of,
such booster dose will, of course, be determined within
the sound judgment of the administrator of such vaccine
or inoculum and according to sound principles well known
in the art. For example, such booster dose could
reasonably be expected to be advantageous at some time
between about 2 weeks and about 6 months following the
initial vaccination. Subsequent doses may be 
administered as indicated. 

The nucleic acid sequences, viruses and
polypeptides of the present invention can also be
administered for purposes of therapy, where a mammal,
especially a primate, and most especially a human, is
already infected, as shown by well known diagnostic
measures. When the nucleic acid sequences, viruses or
polypeptides of the present invention are used for such
therapeutic purposes, many of the same criteria will
apply as when they are used as a vaccine, except that
inoculation will occur post-infection. Thus, when the
nucleic acid sequences, viruses or polypeptides of the
present invention are used as therapeutic agents in the
treatment of infection, the therapeutic agent comprises
a pharmaceutical composition containing a sufficient
amount of said nucleic acid sequences, viruses or
polypeptides so as to elicit a therapeutically effective
response in the organism to be treated. Of course, the
amount of pharmaceutical composition to be administered
will, as for vaccines, vary depending on the immunogen
contained therein (nucleic acid, polypeptide, virus) and
on the route of administration.

The therapeutic agent according to the present 
invention can thus be administered by subcutaneous, 
intramuscular or intradermal routes. One skilled in the
art will certainly appreciate that the amounts to be
administered for any particular treatment protocol can
be readily determined without undue experimentation. Of
course, the actual amounts will vary depending on the
route of administration as well as the sex, age, and
clinical status of the subject which, in the case of
human patients, is to be determined with the sound
judgment of the clinician.

The therapeutic agent of the present invention 
can be employed in such forms as capsules, liquid
solutions, suspensions or elixirs, or sterile liquid
forms such as solutions or suspensions. An inert carrier
is preferably used, such as saline, phosphate-buffered
saline, or any such carrier in which the HCV of the
present invention can be suitably suspended. The
therapeutic agents may be in the form of single dose
preparations or in multi-dose flasks which can be
utilized for mass-treatment programs of both animals and
humans. Of course, when the nucleic acid sequences,
viruses or polypeptides of the present invention are
used as therapeutic agents they may be administered as a
single dose or as a series of doses, depending on the
situation as determined by the person conducting the
treatment.

The nucleic acids, polypeptides and viruses of 
the present invention can also be utilized in the 
production of antibodies against HCV. The term 
"antibody" is herein used to refer to immunoglobulin 
molecules and immunologically active portions of 
immunoglobulin molecules. Examples of antibody 
molecules are intact immunoglobulin molecules, 
substantially intact immunoglobulin molecules and 
portions of an immunoglobulin molecule, including those 
portions known in the art as Fab, F(ab')2 and F(v) as
well as chimeric antibody molecules. 

Thus, the polypeptides, viruses and nucleic 
acid sequences of the present invention can be used in 
the generation of antibodies that immunoreact (i.e.,
specific binding between an antigenic
determinant-containing molecule and a molecule containing an
antibody combining site such as a whole antibody
molecule or an active portion thereof) with antigenic
determinants on the surface of hepatitis C virus
particles.

The present invention therefore also relates 
to antibodies produced following immunization with the 
nucleic acid sequences, viruses or polypeptides of the
present invention. These antibodies are typically
produced by immunizing a mammal with an immunogen or
vaccine, to induce antibody molecules having
immunospecificity for polypeptides or viruses produced
in response to infection with the nucleic acid sequences
of the present invention. When used in generating such
antibodies, the nucleic acid sequences, viruses, or
polypeptides of the present invention may be linked to
some type of carrier molecule. The resulting antibody
molecules are then collected from said mammal.
Antibodies produced according to the present invention
have the unique advantage of being generated in response
to authentic, functional polypeptides produced according
to the actual cloned HCV genome.

The antibody molecules of the present 
invention may be polyclonal or monoclonal. Monoclonal 
antibodies are readily produced by methods well known in 
the art. Portions of immunoglobulin molecules, such as
Fabs, as well as chimeric antibodies, may also be 
produced by methods well known to those of ordinary 
skill in the art of generating such antibodies. 

The antibodies according to the present
invention may also be contained in blood, plasma, serum,
hybridoma supernatants, and the like. Alternatively,
the antibody of the present invention is isolated to the
extent desired by well known techniques such as, for
example, using DEAE Sephadex. The antibodies produced
according to the present invention may be further
purified so as to obtain specific classes or subclasses
of antibody such as IgM, IgG, IgA, and the like.
Antibodies of the IgG class are preferred for purposes
of passive protection.

The antibodies of the present invention are 
useful in the prevention and treatment of diseases 
caused by hepatitis C virus in animals, especially
mammals, and most especially humans. 

In providing the antibodies of the present 
invention to a recipient mammal, preferably a human, the 
dosage of administered antibodies will vary depending on 
such factors as the mammal's age, weight, height, sex, 
general medical condition, previous medical history, and 
the like. 

In general, it will be advantageous to provide 
the recipient mammal with a dosage of antibodies in the 
range of from about 1 mg/kg body weight to about 10 
mg/kg body weight of the mammal, although a lower or 
higher dose may be administered if found desirable. 
Such antibodies will normally be administered by 
intravenous or intramuscular route as an inoculum. The 
antibodies of the present invention are intended to be 
provided to the recipient subject in an amount 
sufficient to prevent, lessen or attenuate the severity, 
extent or duration of any existing infection. 

The antibodies prepared by use of the nucleic
acid sequences, viruses or polypeptides of the present
invention are also highly useful for diagnostic
purposes. For example, the antibodies can be used as in
vitro diagnostic agents to test for the presence of HCV
in biological samples taken from animals, especially
humans. Such assays include, but are not limited to,
radioimmunoassays, EIA, fluorescence, Western blot
analysis and ELISAs. In one such embodiment, the
biological sample is contacted with antibodies of the 
present invention and a labeled second antibody is used 
to detect the presence of HCV to which the antibodies 
are bound. 

Such assays may be, for example, direct where 
the labeled first antibody is immunoreactive with the 
antigen, such as, for example, a polypeptide on the 
surface of the virus; indirect where a labeled second 
antibody is reactive with the first antibody; a 
competitive protocol such as would involve the addition 
of a labeled antigen; or sandwich where both labeled and 
unlabeled antibody are used, as well as other protocols 
well known and described in the art. 

In one embodiment, an immunoassay method would
utilize an antibody specific for HCV envelope
determinants and would further comprise the steps of
contacting a biological sample with the HCV-specific
antibody and then detecting the presence of HCV material
in the test sample using one of the types of assay
protocols as described above. Polypeptides and
antibodies produced according to the present invention
may also be supplied in the form of a kit, either
present in vials as purified material, or present in
compositions and suspended in suitable diluents as
previously described.

In a preferred embodiment, such a diagnostic 
test kit for detection of HCV antigens in a test sample 
comprises in combination a series of containers, each
container holding a reagent needed for such assay. Thus, one
such container would contain a specific amount of HCV- 
specific antibody as already described, a second 
container would contain a diluent for suspension of the 
sample to be tested, a third container would contain a 
positive control and an additional container would 
contain a negative control. An additional container 
could contain a blank. 

For all prophylactic, therapeutic and 
diagnostic uses, the antibodies of the invention and 
other reagents, plus appropriate devices and 
accessories, may be provided in the form of a kit so as 
to facilitate ready availability and ease of use. 

The present invention also relates to the use 
of nucleic acid sequences and polypeptides of the 
present invention to screen potential antiviral agents 
for antiviral activity against HCV. Such screening
methods are known by those of skill in the art. 
Generally, the antiviral agents are tested at a variety 
of concentrations, for their effect on preventing viral 
replication in cell culture systems which support viral 
replication, and then for an inhibition of infectivity 
or of viral pathogenicity (and a low level of toxicity) 
in an animal model system. 

In one embodiment, animal cells (especially 
human cells) transfected with the nucleic acid sequences 
of the invention are cultured in vitro and the cells are 
treated with a candidate antiviral agent (a chemical,
peptide etc.) by adding the candidate agent to the
medium. The treated cells are then exposed, possibly
under transfecting or fusing conditions known in the
art, to the nucleic acid sequences of the present
invention. A sufficient period of time would then be
allowed to pass for infection to occur, following which
the presence or absence of viral replication would be
determined versus untreated control cells by methods
known to those of ordinary skill in the art. Such
methods include, but are not limited to, the detection
of viral antigens in the cell, for example, by
immunofluorescence procedures well known in the art; the
detection of viral polypeptides by Western blotting
using antibodies specific therefor; the detection of
newly transcribed viral RNA within the cells by RT-PCR;
and the detection of the presence of live, infectious
virus particles by injection of cell culture medium or
cell lysates into healthy, susceptible animals, with
subsequent exhibition of the signs and symptoms of HCV
infection. A comparison of results obtained for control
cells (treated only with nucleic acid sequence) with
those obtained for treated cells (nucleic acid sequence
and antiviral agent) would indicate the degree, if any,
of antiviral activity of the candidate antiviral agent.
Of course, one of ordinary skill in the art would
readily understand that such cells can be treated with
the candidate antiviral agent either before or after
exposure to the nucleic acid sequence of the present
invention so as to determine what stage, or stages, of
viral infection and replication said agent is effective
against.

In an alternative embodiment, a viral enzyme
such as NS3 protease, NS2-NS3 protease, NS3 helicase or
NS5B RNA polymerase may be produced from a nucleic acid
sequence of the invention and used to screen for
inhibitors which may act as antiviral agents. The
structural and nonstructural regions of the HCV genome, 
including nucleotide and amino acid locations, have been 
determined, for example, as depicted in Houghton, M. 
(1996), Fig. 1; and Major, M.E. et al. (1997), Table 2. 

Such above-mentioned protease inhibitors may 
take the form of chemical compounds or peptides which 
mimic the known cleavage sites of the protease and may 
be screened using methods known to those of skill in the 
art (Houghton, M. (1996) and Major, M.E. et al. (1997)).
For example, a substrate may be employed which mimics
the protease's natural substrate, but which provides a
detectable signal (e.g. by fluorimetric or colorimetric
methods) when cleaved. This substrate is then incubated
with the protease and the candidate protease inhibitor
under conditions of suitable pH, temperature etc. to 
detect protease activity. The proteolytic activities of 
the protease in the presence or absence of the candidate 
inhibitor are then determined. 
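
The comparison of proteolytic activity with and without a candidate inhibitor can be expressed as a percent inhibition. The following is a minimal sketch only: the function name, the optional background correction and the numerical readings are illustrative assumptions and are not part of the specification.

```python
def percent_inhibition(signal_with_inhibitor, signal_without_inhibitor, background=0.0):
    """Percent inhibition of protease activity from two assay readings.

    Readings are in arbitrary fluorimetric or colorimetric units.
    `background` is the reading of a no-enzyme blank (an assumption added
    for illustration; the text does not describe a blank for this assay).
    """
    uninhibited = signal_without_inhibitor - background
    if uninhibited <= 0:
        raise ValueError("uninhibited signal must exceed background")
    inhibited = signal_with_inhibitor - background
    return 100.0 * (1.0 - inhibited / uninhibited)


# Hypothetical readings, for illustration only
example = percent_inhibition(
    signal_with_inhibitor=300.0,
    signal_without_inhibitor=1100.0,
    background=100.0,
)
print(f"{example:.1f}% inhibition")
```

Repeating the calculation over a range of inhibitor concentrations would give the dose-response data from which a candidate could be ranked.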

In yet another embodiment, a candidate 
antiviral agent (such as a protease inhibitor) may be 
directly assayed in vivo for antiviral activity by 
administering the candidate antiviral agent to a 
chimpanzee transfected with a nucleic acid sequence of 
the invention or infected with a virus of the invention 
and then measuring viral replication in vivo via methods 
such as RT-PCR. Of course, the chimpanzee may be 
treated with the candidate agent either before or after 
transfection with the infectious nucleic acid sequence
or infection with a virus of the invention so as to
determine what stage, or stages, of viral infection and
replication the agent is effective against.

The invention also provides that the nucleic
acid sequences, viruses and polypeptides of the 
invention may be supplied in the form of a kit, alone or 
in the form of a pharmaceutical composition. 

All scientific publications and/or patents
cited herein are specifically incorporated by reference.
The following examples illustrate various aspects of the
invention but are in no way intended to limit the scope
thereof.

EXAMPLES 
Materials and Methods 

Source of HCV 

An infectious plasma pool of HCV genotype 2a
(HC-J6CH) prepared from acute phase plasma of a
chimpanzee experimentally inoculated with plasma from a
Japanese patient infected with strain HC-J6 (Okamoto et
al., 1991) was used for cloning. An infectious cDNA
clone of HCV strain H77, genotype 1a, was also used
(pCV-H77C; Yanagi et al., 1997).

Amplification, cloning and sequence analysis 

Viral RNA was extracted from 100 µl aliquots
of the HC-J6CH plasma pool with the TRIzol system
(GIBCO/BRL) (Yanagi et al., 1997). Primers used in cDNA
synthesis and PCR amplification were based on the
genomic sequence of strain HC-J6 (Okamoto et al., 1991)
and on the conserved region (3'X) of the 3' UTR of HCV
genotype 2a (Tanaka et al., 1996) (Table 1). The RNA
was denatured at 65°C for 2 min, and cDNA was
synthesized at 42°C for 1 hour with Superscript II
reverse transcriptase (GIBCO/BRL) and specific reverse
primers in 20 µl reaction volumes. The cDNA mixtures
were treated with RNase H and RNase T1 (GIBCO/BRL) at
37°C for 20 min.

TABLE 1

Oligonucleotides used for amplification and cloning
of strain HC-J6CH, genotype 2a

Designation     Sequence (5'-3') a

2427S-H77       ACTGGACACGGAGGTGGCCGCGTC
2426S-H77       TTGTTCTTGTCGGGTTAATGGCGC
2645R-H77       GGGTGTACTACACACATGAGTAAG
2832R-H77       AAGCGCCCCTAACTGATGATG
H2751SII        CGTCATCGMBkCCTCAGCGGGCATATGCACTGGACACGGA
H2786R          GTCCAGTGCATATGCCCGCTGAGG
H2870R          CATGCACCAGCTGATATAGCGCTTGTAATATG
H7851S          TCCGTAGAGGAAGCTTGCAGCCTGACGCCC
H9140S (K)      CAGAGGAGGCAGGGTGCTATATGTGGCAAGTAC
H9173R (M)      GTACTTGCCACATATAGCAGCCCTGCCTCCTCTG
H9471R          CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC
J6-H2556S       TTATGGATGCTCATCTTGTTGGGCCAGGCCGAAGCAGCTTTGGAGAACCTCGTAATACT
                CAATGC
356RF-J6H       AGGATTTGTGCTCATGGTGCACGGTCTACGAG
1S-J6F b        TTTTTTTTCOGGCCGC TJl&TACGAC TCACTJLTAG3\C COGC CCCT AAT AGG
333S-J6         CCGTGCACCATGAGCACAAATCCTAAACCTC
753R-J6         GGATGTACCCCATGAGGTCGGCAAAG
2543S-J6F       GTTTGCGCCTGCTTATGGATGCTCATCTTG
2787R-J6(26)    GCGTCATAAGCATATGCCTGTTGGGG
3329R-J6        CCCTCAGCACTGGAGTACATCTG
5487-J6F        CGTCATGCATACCCCTAGGGCGGCTCTCATTGAAGAGGG
5518R-J6F       CGTCCCCTCTTCAATGAGAGCCGCTCTAGA
9251S-J6F       GCGGTGAAGACCAAGCTCAAACTCACTC
9305R-J6F       AATCTAGAAGGCGCGCTTCCGGCAATGGAGTGAGTTTGAGC
9310R-J6F       CGTCTCTAX5AGGATAAATCCAGGAGGCGCGCTTCCGGC
9399S-J6F       TACTTTTTGTAGGGGTAGGCCTTTTCC
9464-J6F        CGTCTCTAGAGTGTAGCTAATGTGTGCCGCTCTA
9470(24)-J6     CTATGGAGTGTAGCTAATGTGTGC
J6-3'XR         CGTCTCTAGACATGATCTGCAGAGAGACCAGTTACGGCACTCTCTGFCAGTCATGCGGC
                TCACGGACCTTTCACAGCTAGCCGTGACTAGGGCTAAGATGGAGCCACC

a HCV-specific sequences are shown in plain text, non HCV-specific
sequences are shown in bold face, and cleavage sites used for cDNA
cloning are underlined.

b The core sequence of the T7 promoter is shown in italics.



The strategy used to amplify and clone the 
full-length HC-J6CH sequence is shown in Fig. 1.
Nucleotide positions correspond to those of the 2a
infectious clone (pJ6CF) that is described herein. The
5' end of HC-J6CH (nts. 17-297, excluding primer
sequences) was amplified from 2 µl of cDNA synthesized
with primer a-2 (Yanagi et al., 1996). PCR was performed
with AmpliTaq Gold DNA polymerase (Perkin-Elmer) as
described previously (Yanagi et al., 1996) using primers
1S-J6F and a-2. After purification, the amplified
products were cloned into pGEM-T Easy vector (Promega) 
using standard procedures and 5 clones (pJ6-5'UTR) were 
sequenced. 

The 3' end of HC-J6CH was amplified in 3
overlapping pieces. RT-PCR of a short fragment of NS5B
(nts. 9279-9439) was performed with primers 9251S-J6F
and 9464R-J6F as described above. The PCR products were
cloned into pGEM-T Easy vector and sequence analysis was
performed on 5 pJ6-3'F clones. A second region
spanning from NS5B to the conserved region of the 3' UTR
(nts. 9376-9629) was amplified in RT-nested PCR
(external primers H9261F and H3'X58R, internal primers
H9282F and H3'X45R) (Yanagi et al., 1997). The amplified
products were cloned into pGEM-9zf(-) by using HindIII
and XbaI sites and 14 pJ6-3'VR clones were sequenced.
The third fragment, which included the 3' terminal
sequence, was amplified with primers 9399S-J6F and
J6-3'XR from one of the pJ6-3'VR clones, and cloned into
one of the pJ6-3'F clones by using StuI and XbaI sites
(pJ6-3'X).

The ORF of HCV HC-J6CH was amplified by long
RT-PCR in 3 overlapping pieces. The amplification was
performed on 2 µl of the cDNA mixtures with the
Advantage cDNA polymerase mix (Clontech) (Yanagi et al.,
1997). The J6S fragment (nts. 86-2761) was amplified
with primers a-1 (Yanagi et al., 1996) and J6-2787R from
cDNA synthesized with primer J6-3329R. A single PCR
round was performed in a Robocycler thermal cycler
(Stratagene), and consisted of denaturation at 99°C for
35 sec, annealing at 67°C for 30 sec and elongation at
68°C for 4 min 30 sec during the first 5 cycles, 5 min
during the next 10 cycles, 5 min 30 sec during the
following 10 cycles and 6 min during the last 10 cycles.
The J6B fragment (nts. 2573-5488) was amplified with
primers 2543S-J6F and 5518R-J6F from cDNA synthesized
with primer 5518R-J6F. Finally, the J6A fragment (nts.
5515-9282) was amplified with primers 5487S-J6F and
9310R-J6F from cDNA synthesized with primer
9470R(24)-J6F. PCR amplifications of fragments J6B and
J6A consisted of denaturation at 99°C for 35 sec,
annealing at 67°C for 30 sec and elongation at 68°C for 6
min during the first 5 cycles, 7 min during the next 10
cycles, 8 min during the following 10 cycles and 9 min
during the last 10 cycles.
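
The two cycling programs described above, in which the elongation time is lengthened in stages, can be written down as data and summarized. This is only an illustrative sketch: the variable and function names are hypothetical, and the time spent ramping between temperatures is ignored because the text does not state it.

```python
# Hold times taken from the text; everything else is illustrative.
DENATURE_S = 35  # denaturation at 99°C, seconds per cycle
ANNEAL_S = 30    # annealing at 67°C, seconds per cycle

# Each program is a list of (number_of_cycles, elongation_seconds) stages.
PROGRAMS = {
    "J6S": [(5, 4 * 60 + 30), (10, 5 * 60), (10, 5 * 60 + 30), (10, 6 * 60)],
    "J6B/J6A": [(5, 6 * 60), (10, 7 * 60), (10, 8 * 60), (10, 9 * 60)],
}


def total_cycles(stages):
    """Total number of PCR cycles in a program."""
    return sum(cycles for cycles, _ in stages)


def hold_time_seconds(stages):
    """Sum of programmed hold times; ramping between steps is not included."""
    return sum(
        cycles * (DENATURE_S + ANNEAL_S + elongation)
        for cycles, elongation in stages
    )


for name, stages in PROGRAMS.items():
    minutes = hold_time_seconds(stages) / 60
    print(f"{name}: {total_cycles(stages)} cycles, {minutes:.1f} min of programmed holds")
```

Both programs run for 35 cycles; they differ only in the elongation times, which are longer for the larger J6B and J6A fragments.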

After purification of the long PCR products 
with QIAquick PCR purification kit (QIAGEN), A-tailing
reactions were performed with AmpliTaq DNA polymerase
(Perkin Elmer) at 72°C for 1 hour. The gel-purified
A-tailed PCR products were cloned into pCR2.1 vector
(Invitrogen) or pGEM-T Easy vector (Promega). DH5-alpha
competent cells (GIBCO BRL) were transformed and
selected on LB agar plates containing 100 µg/ml
ampicillin (SIGMA) and amplified in LB liquid cultures
at 30°C for 18 - 20 hrs (Yanagi et al., 1997). Midiprep
was performed using Wizard Plus Midipreps DNA
Purification System (Promega). Multiple clones of the
J6S, J6A and the J6B fragments were sequenced.

The consensus sequence of strain HC-J6CH (nts.
17-9629) was determined by direct sequencing of PCR
products (nts. 297-3004 and nts. 4893-5762) and by
sequence analysis of the TA clones (nts. 17-5488 and
nts. 5515-9629) (Fig. 1). Both strands of DNA were
sequenced in all cases. Analyses of genomic sequences, 
including multiple sequence alignments and tree 
analyses, were performed with GeneWorks (Oxford 
Molecular Group) (Bukh et al., 1995). 
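
Deriving a consensus from several aligned clone sequences can be illustrated by a simple majority rule. The sketch below is an assumption-laden illustration, not the GeneWorks procedure used in the study: the function name, the handling of ties and the toy sequences are all hypothetical.

```python
from collections import Counter


def consensus(aligned_clones, ambiguous="N"):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    A position where the two most frequent residues are equally common
    is reported as `ambiguous` rather than resolved arbitrarily.
    """
    lengths = {len(seq) for seq in aligned_clones}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_clones):
        ranked = Counter(column).most_common(2)
        tie = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        result.append(ambiguous if tie else ranked[0][0])
    return "".join(result)


# Hypothetical toy alignment (not HC-J6CH data): three clones each carry
# a single substitution that the majority of clones does not share.
toy_clones = ["ACGTAC", "ACGTAC", "ACCTAC", "ATGTAC", "ACGTAT"]
toy_consensus = consensus(toy_clones)
print(toy_consensus)
```

In this toy case every substitution is seen in only one clone, so the consensus is free of ambiguity, which is the situation the text reports for the strain as a whole.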

Construction of chimeric cDNA clones of genotypes 1a &
2a

Four full-length intertypic chimeric cDNA
clones were constructed (Figs. 4, 5A, 5B). In each clone
the C, E1 and E2 genes encoded the consensus amino acid
sequence of HC-J6CH. The p7 protein was encoded either by
the HC-J6CH or pCV-H77C consensus sequence, and the NS
proteins were all encoded by pCV-H77C genes. To
engineer these cDNA clones, an NdeI site from pCV-H77C
was first eliminated by a silent substitution (C to T)
at position 9158. In brief, two fragments were
amplified from pCV-H77C with primers H7851S and
H9173R(M) and with primers H9140S(M) and H9417R (Table
3), gel-purified and used for fusion PCR with primers
H7851S and H9417R. The fusion PCR products were cloned
into pCV-H77C by using HindIII and AflII sites. A new
artificial NdeI site was introduced by a silent
substitution (C to T) at position 2765. PCR products,
which were amplified from pCV-H77C with primer H2751SII
containing artificial ClaI and NdeI sites and primer
H2870R, were cloned into the modified pCV-H77C by using
ClaI and Eco47III sites. The final construct (pH77CV)
was used as a cassette vector to construct the
intertypic chimeric HCV cDNA clones.
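
The idea of a silent substitution that creates or removes a restriction site can be checked with the standard genetic code. The sketch below is illustrative only: the helper names are hypothetical and the nine-nucleotide fragment is invented for the example (it is not taken from pCV-H77C and its reading frame is an assumption).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NDEI_SITE = "CATATG"  # NdeI recognition sequence


def substitute(seq, index, new_base):
    """Return seq with the base at the 0-based index replaced."""
    return seq[:index] + new_base + seq[index + 1:]


def is_silent(codon, index, new_base):
    """True if changing one base of the codon leaves the amino acid unchanged."""
    return CODON_TABLE[codon] == CODON_TABLE[substitute(codon, index, new_base)]


# Hypothetical in-frame fragment: codons GCA TAC GCA (Ala Tyr Ala).
fragment = "GCATACGCA"
mutated = substitute(fragment, 5, "T")  # C to T in the third codon position

print(NDEI_SITE in fragment, NDEI_SITE in mutated)
print(is_silent("TAC", 2, "T"))
```

Here the C to T change converts TAC to TAT, both of which encode tyrosine, while introducing an NdeI site into the hypothetical fragment.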

The four chimeric cDNA clones were constructed
as follows. pH77CV-J6S (nucleotide sequence shown in
SEQ ID No:3 and amino acid sequence shown in SEQ ID
No:4): The AgeI/BsmI fragment of clone J6S2 and the
BsmI/NdeI fragment of clone J6S1 were cloned into
pH77CV by using AgeI and NdeI sites; pH77(p7)CV-J6S
(nucleotide sequence shown in SEQ ID No:5 and amino acid
sequence shown in SEQ ID No:6): A fragment of pH77CV-J6S
was replaced with a fragment amplified from pCV-H77C
with primers J6-H2556S and H2786R by using BsaBI and
NdeI sites; pH77-J6S (nucleotide sequence shown in SEQ ID
No:7 and amino acid sequence shown in SEQ ID No:8): A
fragment amplified from pCV-H77C with primers a-1
and 356RF-J6H77 and another fragment amplified from
pH77CV-J6S with primers 333S-J6 and 753R-J6 were
gel-purified and a fusion PCR was performed with primers
a-1 and 753R-J6. The AgeI/ClaI fragment of the
subcloned fusion PCR products and the ClaI/NdeI fragment
of pH77CV-J6S were cloned into pH77CV-J6S by using AgeI
and NdeI sites; pH77(p7)-J6S (nucleotide sequence shown
in SEQ ID No:9 and amino acid sequence shown in SEQ ID
No:10): The AgeI/ClaI fragment of pH77-J6S and the
ClaI/NdeI fragment of pH77(p7)CV-J6S were cloned into
pH77(p7)CV-J6S by using AgeI and NdeI sites.

Each intertypic chimeric cDNA clone was
retransformed to select a single clone, and large-scale
preparation of plasmid DNA was performed with a QIAGEN
plasmid Maxi kit as described previously (Yanagi et al.,
1997). Each of the four cDNA clones was completely
sequenced before inoculation. Each clone was
genetically stable since the digestion pattern was as
expected following retransformation and the complete
sequence was the expected one.

Construction of full-length cDNA clone HC-J6CH

An overview of the full-length HC-J6CH clone is
presented in Fig. 1. In the final construct pJ6CF,
which encodes the consensus polyprotein of HC-J6CH, an
XbaI site was eliminated by a silent substitution (A to
G) at position 5494. Digested fragments containing the
consensus sequence were purified from the appropriate
subclones and ligated using the sites indicated. The
full-length cDNA clone (pJ6CF) was retransformed to
select a single clone, and large-scale preparation of
plasmid DNA, followed by complete sequence analysis,
was performed. Clone pJ6CF was genetically stable.

Intrahepatic transfection of chimpanzee with transcribed
RNA

In duplicate 100 µl reactions, RNA was
transcribed in vitro with T7 RNA polymerase (Promega)
from 10 µg of template plasmid linearized with XbaI
(Promega) as described previously (Yanagi et al., 1997).
The integrity of the RNA was checked by electrophoresis
through agarose gel stained with ethidium bromide
(Yanagi et al., 1997). Each transcription mixture was
diluted with 400 µl of ice-cold phosphate-buffered
saline without calcium or magnesium and then immediately
frozen on dry ice and stored at -80°C. Within 24 hours,
both transcription mixtures were injected into the same
chimpanzee by percutaneous intrahepatic injection guided
by ultrasound (Yanagi et al., 1998, 1999). If the
chimpanzee did not become infected, the same
transfection was repeated once. After two negative
results, the next clone was inoculated into the same
chimpanzee following the same protocol. Injections were
performed at weeks 0 and 2 with pH77CV-J6S, at weeks 5
and 8 with pH77(p7)CV-J6S, at weeks 14 and 16 with
pH77-J6S, at weeks 19 and 23 with pH77(p7)-J6S, at week
28 with pJ6CF, and finally at week 34 with pCV-H77C.
The chimpanzee was maintained under conditions that met
or exceeded all requirements for its use in an approved
facility.

Serum samples were collected weekly from the
chimpanzee and monitored for liver enzyme levels by
standard procedures, anti-HCV antibodies by the
second-generation ELISA (Abbott) and HCV RNA by a
sensitive RT-nested PCR assay with AmpliTaq Gold DNA
polymerase using primers from the 5' UTR (Yanagi et al.,
1996). Samples were scored as negative for HCV RNA if
two independent tests on 100 µl of serum were negative.
The genome equivalent (GE) titer of HCV in positive
samples was determined by RT-nested PCR on 10-fold
serial dilutions of the extracted RNA (Bukh et al.,
1998). The consensus sequence of the complete ORF from
the chimpanzee infected with RNA transcripts of pJ6CF
was determined by direct sequencing of overlapping PCR
products obtained by long RT-nested PCR as previously
described (Yanagi et al., 1997) with HC-J6 specific
primers. After the intrahepatic transfection with RNA
transcripts of pCV-H77C, we performed H77 (genotype 1a)-
specific RT-nested PCR with primers 2427S-H77 and
2832R-H77 for the 1st round and with primers 2462S-H77
and 2645R-H77 for the 2nd round (Table 3). The
sensitivity of this assay was equivalent to that of the
assay using 5' UTR primers when testing serum containing
only H77, genotype 1a. The genome titer of genotype 1a
was determined by using this specific RT-nested PCR on
10-fold serial dilutions of the extracted RNA.
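
An end-point titer from 10-fold serial dilutions can be estimated as sketched below. The text does not give the calculation, so this is an assumption: it treats the highest dilution that still scores positive as containing about one genome equivalent, and it takes the serum volume represented in the undiluted reaction as an input. The function name and the example results are hypothetical, and the cited procedure (Bukh et al., 1998) may differ in detail.

```python
def endpoint_titer_ge_per_ml(results, serum_volume_ml):
    """Estimate a genome equivalent (GE) titer from an end-point dilution series.

    `results` maps the dilution exponent k (for a 10**-k dilution) to True
    (RT-nested PCR positive) or False (negative). The last positive dilution
    is assumed to contain roughly one GE, so the undiluted sample contains
    about 10**k GE in `serum_volume_ml` of serum.
    Returns None when no dilution is positive.
    """
    positives = [k for k, positive in results.items() if positive]
    if not positives:
        return None
    return 10 ** max(positives) / serum_volume_ml


# Hypothetical series: positive through the 10**-3 dilution, negative at 10**-4,
# starting from RNA extracted from 0.1 ml (100 µl) of serum.
example_results = {0: True, 1: True, 2: True, 3: True, 4: False}
example_titer = endpoint_titer_ge_per_ml(example_results, serum_volume_ml=0.1)
print(f"about {example_titer:.0e} GE/ml")
```

Because the dilution steps are 10-fold, an estimate of this kind is only resolved to the nearest order of magnitude.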

EXAMPLE 1 

Sequence analysis of HCV strain HC-J6CH

As minor deviations from the consensus amino
acid sequence were found previously to render
full-length HCV cDNA clones noninfectious (Yanagi et
al., 1997, 1998), the consensus sequence of the cloning
source of genotype 2a (strain HC-J6CH) was determined
prior to constructing any full-length clones. In brief,
a plasma pool containing strain HC-J6CH was prepared from
acute phase plasmapheresis units collected from a
chimpanzee experimentally infected with HC-J6 (Okamoto
et al., 1991). The HCV genome titer of this pool was
10^5.4 genome equivalents (GE)/ml (Quantiplex HCV RNA
bDNA 2.0, Chiron) and the infectivity titer was 10^4
chimpanzee infectious doses/ml.
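
The two titers quoted above imply a ratio of genome equivalents to chimpanzee infectious doses. The short calculation below is an illustration added for clarity; the variable names are hypothetical and the ratio itself is not stated in the text.

```python
# Titers from the text, expressed as log10 values per ml of the plasma pool.
genome_titer_log10 = 5.4       # log10 GE/ml
infectivity_titer_log10 = 4.0  # log10 chimpanzee infectious doses/ml

# Dividing the titers is the same as subtracting their log10 values.
ge_per_infectious_dose = 10 ** (genome_titer_log10 - infectivity_titer_log10)
print(f"about {ge_per_infectious_dose:.0f} GE per chimpanzee infectious dose")
```

On these figures, one chimpanzee infectious dose corresponds to roughly 25 genome equivalents of the pool.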

The consensus sequence of the 5' UTR of HC-J6CH
(nts. 17-340) was deduced from 5 clones containing nts.
17-297 and 8 clones containing nts. 86-340. The 5' UTR
of the various clones was highly conserved, but the
consensus sequence of HC-J6CH differed by 2 nucleotides
from that published previously for HC-J6 (Okamoto et
al., 1991: C to T at position 36 and T to C at position
222).

The consensus sequence of 14 clones of the 3'
UTR of HC-J6CH indicated that the 39 nucleotide long
variable region was highly conserved in this strain and
was identical to that previously published for HC-J6
(Okamoto et al., 1991). The polypyrimidine tract varied
greatly in length (84-164 nucleotides), and contained
some conserved A residues. In the conserved region, the
proximal 16 nucleotides were identical to those
previously published for isolates of different HCV
genotypes (Kolykhalov et al., 1996; Tanaka et al., 1996;
Yamada et al., 1996). The remaining 82 nucleotides of
the conserved region were determined for other genotype
2a strains (Tanaka et al., 1996) but not for HC-J6 or
HC-J6CH.

The ORF of HC-J6CH was amplified in 3 fragments
by RT-PCR (Fig. 1). Eight clones of the J6S fragment
(nts. 86-2761), 6 clones of the J6B fragment (nts.
2573-5488) and 6 clones of the J6A fragment (nts.
5515-9298) were sequenced. PCR fragments containing
nts. 5489-5514 were sequenced directly. A quasispecies
was found at 243 nucleotide (2.7%) and 69 amino acid
(2.3%) positions, scattered throughout the 9099 nts
(3033 aa) of the ORF. However, the majority, 231
nucleotide substitutions, were detected only once and
71.6% of these represented silent mutations. The 12
remaining nucleotide substitutions were each restricted
to 2 clones and only 4 of these resulted in amino acid
changes. The nucleotide difference among the J6S clones
ranged from 0.1 to 1.3%, among the J6B clones it ranged
from 0.1 to 0.3%, and among the J6A clones it ranged
from 0.2 to 4.0% (Fig. 2). Three of 8 J6S clones, 4 of 6 J6B
clones, and all 6 J6A clones had defective polyproteins
due to nucleotide deletions, insertions or
substitutions.
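
The pairwise nucleotide differences quoted above are, in essence, mismatch counts between aligned clones expressed as a percentage. A minimal Python sketch, using short hypothetical sequences rather than the actual J6S, J6B or J6A clones, might look like this:

```python
from itertools import combinations

def percent_difference(a: str, b: str) -> float:
    """Percentage of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)

def pairwise_range(clones):
    """Smallest and largest pairwise percent difference among the clones."""
    values = [percent_difference(a, b) for a, b in combinations(clones, 2)]
    return min(values), max(values)

# Hypothetical 10-nt toy clones, for illustration only.
toy_clones = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
lowest, highest = pairwise_range(toy_clones)
print(f"{lowest:.1f} - {highest:.1f} %")
```

A single divergent clone widens only the upper end of such a range, while the lower end still reflects the closely related clones.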

The sequences of clones of strain HC-J6CH were
relatively homogeneous. This was highlighted by the
high degree of conservation among clones of the HVR1
(Fig. 3), a region frequently used to study the
quasispecies of HCV (Bukh et al., 1995). An exception
was the sequence of clone J6A1, which differed by about
4% from the other clones of this region (Fig. 2).
Importantly, the consensus sequence of strain HC-J6CH
(nts. 17-9629) could be determined with no ambiguity at
the nucleotide or deduced amino acid level. The
difference between the consensus ORF sequence of HC-J6CH
from the experimentally infected chimpanzee and that of
HC-J6 of the inoculum (Okamoto et al., 1991) was 4.1%
and 2.2% at the nucleotide and deduced amino acid
levels, respectively (Fig. 2, Table 2). Moreover, we
found that 12 (44.4%) of the 27 amino acids constituting
HVR1 differed between HC-J6CH and HC-J6 (Fig. 3). Such
diversities are greater than the < 2% generally
considered to comprise a quasispecies. In fact, these
differences are equivalent to those found between the
two prototype strains of HCV genotype 1a [strains HCV-1
(Choo et al., 1991) and H77 (Yanagi et al., 1997)].
These results indicated that HC-J6CH, which represented
the major species in the experimentally infected
chimpanzee, was a minor species in the original
inoculum.

TABLE 2

Percent difference of nucleotide and predicted amino acid sequences
between strain HC-J6 (Okamoto et al., 1991) and strain HC-J6CH from
the experimentally infected chimpanzee

Genome Region   nt. position (a)   % nt. difference      % a.a. difference

ORF             341-9439           4.1 (373/9099) (b)    2.2 (66/3033) (b)
5' UTR          17-340             0.6 (2/324)
Core            341-913            0.5 (3/573)           0 (0/191)
E1              914-1489           4.3 (25/576)          2.1 (4/192)
HVR1            1490-1570          24.7 (20/81)          44.4 (12/27)
E2-HVR1         1571-2590          3.9 (40/1020)         3.2 (11/340)
p7              2591-2779          3.7 (7/189)           3.2 (2/63)
NS2             2780-3430          4.0 (26/651)          2.8 (6/217)
NS3             3431-5323          4.0 (76/1893)         0.8 (5/631)
NS4A            5324-5485          4.3 (7/162)           1.9 (1/54)
NS4B            5486-6268          3.7 (29/783)          0.4 (1/261)
NS5A            6269-7666          5.4 (75/1398)         3.4 (16/466)
NS5B            7667-9439          3.7 (65/1773)         1.4 (8/591)
3' UTR          9440-9481          0 (0/42)

(a) The nucleotide positions correspond to those of the infectious
full-length genotype 2a clone (pJ6CF).
(b) The numbers in parenthesis indicate the nucleotide or amino acid
differences for each region.
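
The percentages in Table 2 follow directly from the counts in parentheses and from the nucleotide positions. The sketch below recomputes the nucleotide column for the ORF regions and checks that the regions add up to the ORF totals. The label "E2-HVR1" for nts. 1571-2590 (E2 without HVR1) and all Python names are assumptions made for illustration.

```python
# (start, end, nucleotide differences) per ORF region, as listed in Table 2.
# "E2-HVR1" is an assumed label for E2 without HVR1 (nts. 1571-2590).
ORF_REGIONS = {
    "Core":    (341, 913, 3),
    "E1":      (914, 1489, 25),
    "HVR1":    (1490, 1570, 20),
    "E2-HVR1": (1571, 2590, 40),
    "p7":      (2591, 2779, 7),
    "NS2":     (2780, 3430, 26),
    "NS3":     (3431, 5323, 76),
    "NS4A":    (5324, 5485, 7),
    "NS4B":    (5486, 6268, 29),
    "NS5A":    (6269, 7666, 75),
    "NS5B":    (7667, 9439, 65),
}

def region_stats(regions):
    """Return {name: (length in nts, percent difference to 1 decimal)}."""
    stats = {}
    for name, (start, end, diffs) in regions.items():
        length = end - start + 1  # positions are inclusive
        stats[name] = (length, round(100.0 * diffs / length, 1))
    return stats

stats = region_stats(ORF_REGIONS)
total_len = sum(length for length, _ in stats.values())
total_diffs = sum(diffs for _, _, diffs in ORF_REGIONS.values())
orf_pct = round(100.0 * total_diffs / total_len, 1)

for name, (length, pct) in stats.items():
    print(f"{name:8s} {length:5d} nt  {pct:5.1f} %")
print(f"{'ORF':8s} {total_len:5d} nt  {orf_pct:5.1f} %")
```

The regional lengths sum to the 9099 nts of the ORF and the regional differences to the 373 substitutions given in the first row of the table.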

Example 2 

Chimeric molecular clones 

As chimeric flaviviruses with substituted
structural genes have been useful in defining the
biological function of viral sequences or proteins, in
analyzing immune responses and in generating attenuated
vaccine candidates (Bray and Lai, 1991; Chambers et al.,
1999; Pletnev et al., 1992, 1993, 1998), the consensus
sequence of the 2a structural genes and surrounding
region was substituted for that of the infectious 1a
cDNA clone. In the genotype 1a backbone, two silent
mutations were introduced for cloning purposes [at
positions 2765 (p7) and 9158 (NS5B) of pCV-H77C] (Fig.
4). The complete sequence of each chimera was verified.
Infectivity of RNA transcripts from four different
intertypic chimeric clones (Figs. 4, 5A, 5B) was evaluated
by consecutive intrahepatic transfections of a chimpanzee.
Clones were considered not to be viable if viral RNA was not
detected in the serum within two weeks of the repeat
transfection. All chimeric clones contained the C, E1 and E2
genes of genotype 2a. The two chimeric clones tested
initially differed from each other in that one had the p7
gene of 2a (pH77CV-J6S) and the other [pH77(p7)CV-J6S] the
p7 gene of 1a. They differed from the two other clones in
that the 186 nucleotides of the 5' UTR just upstream of the
initiation codon were from the 2a genotype. Since neither
clone containing the chimeric 5' UTR was infectious, the
chimeric 5' UTR was replaced with the consensus genotype 1a
5' UTR to generate the two p7 varieties [pH77-J6S and
pH77(p7)-J6S]. After consecutive transfection of the four
clones, no HCV RNA, anti-HCV or ALT elevation was detected
in the chimpanzee during 28 weeks of follow-up, suggesting
that RNA transcripts from these intertypic chimeric clones
were not viable in vivo.

This finding that the intertypic clones between
genotypes 1a and 2a were not viable was surprising since
flavivirus chimeras containing the structural region of
dengue virus type 1 or 2 or of tick-borne encephalitis virus
and the nonstructural region of an infectious dengue type 4
virus were viable (Bray and Lai, 1991; Pletnev et al., 1992,
1993). While considerable sequence variation exists between
the infectious genotype 1a and 2a clones of HCV (Table 3),
these flaviviruses exhibit a higher degree of genetic
heterogeneity than do the major genotypes of HCV. For other
flaviviruses, however, it was possible to obtain
infectious chimeric clones only if the capsid region was
derived from the backbone cDNA clone (Chambers et al.,
1999; Pletnev and Men, 1998).



TABLE 3

Percent difference of the amino acid sequences between
the infectious clone of genotype 1a (pCV-H77C;
Yanagi et al., 1997) and the infectious clone of
genotype 2a (pJ6CF)

Genome Region (a)   % difference

Polyprotein         27.9 (839/3007) (b)
Core                8.9 (17/191)
E1                  37.0 (71/192)
HVR1                59.3 (16/27)
E2-HVR1             27.1 (91/336)
p7                  38.1 (24/63)
NS2                 41.9 (91/217)
NS3                 19.2 (121/631)
NS4A                33.3 (18/54)
NS4B                26.8 (70/261)
NS5A                38.5 (171/444)
NS5B                25.2 (149/591)

(a) Genome regions defined as in Table 1.
(b) The numbers in parenthesis indicate the amino
acid differences for each region.
Positions with deletions or insertions in E2 (4
aa positions) and NS5A (26 aa positions) were
not considered.

Trivial explanations may account for the lack
of viability of these intertypic chimeras. First, the
two silent mutations introduced in the genotype 1a
backbone (one in p7 and one in NS5B) for cloning
purposes could potentially eliminate infectivity. This
is, however, very unlikely since mutations at these
positions exist among field isolates of HCV including
strain HC-J6CH (Bukh et al., 1998). Also, it is
noteworthy that the three previously published
infectious clones of strain H77 had numerous silent
nucleotide differences (Hong et al., 1999; Kolykhalov et
al., 1997; Yanagi et al., 1997). Second, signal
peptidases might not cleave the chimeric E2/p7 or p7/NS2
junction. This seems unlikely, however, since
eukaryotic signal peptidases typically recognize the
amino acid sequences upstream of the cleavage site [the
(-3, -1) rule] (Nielsen et al., 1997) and the amino
acids at these two sites are conserved between genotypes
1a and 2a (Fig. 5B). Finally, the E2/p7 and/or p7/NS2
gene junctions could differ between genotypes 1a and 2a.
The junctions determined for genotypes 1a and 1b were
used (Lin et al., 1994; Mizushima et al., 1994; Selby et
al., 1994) because those for genotype 2a have not been
identified. In the latter two cases, further analyses
of genotype 2a should eventually provide sufficient data
to overcome such potential problems and it would most
likely be possible to construct a viable chimera.
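
The (-3, -1) rule mentioned above can be illustrated with a very small check on the residues three positions and one position upstream of a candidate cleavage site. The residue classes below are a simplified assumption chosen for illustration, the peptides are hypothetical and are not taken from Fig. 5B, and the sketch is not a substitute for a trained cleavage-site predictor.

```python
# Simplified, assumed residue classes for the (-3, -1) rule.
ALLOWED_AT_MINUS_1 = set("AGSCT")      # small, uncharged side chains
ALLOWED_AT_MINUS_3 = set("AGSCTVIL")   # small or aliphatic, uncharged

def fits_minus3_minus1_rule(upstream: str) -> bool:
    """Check positions -3 and -1, where `upstream` ends at position -1."""
    if len(upstream) < 3:
        raise ValueError("need at least three residues upstream of the site")
    return (upstream[-3] in ALLOWED_AT_MINUS_3
            and upstream[-1] in ALLOWED_AT_MINUS_1)

# Hypothetical peptides ending immediately before a candidate cleavage site.
for peptide in ("LLVAGA", "LLVKGA", "LLVAGR"):
    print(peptide, fits_minus3_minus1_rule(peptide))
```

Because the check looks only at residues upstream of the site, a chimeric junction whose upstream residues are conserved would pass or fail in the same way as the parental sequence.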

More complicated explanations for the lack of
viability of the chimeras might be required if critical
genotype-specific interactions occur as regards the
structural proteins, the nonstructural proteins and the
genomic RNA. For instance, one cannot rule out that the
chimeras were not viable because the IRES function was
compromised. In in vitro studies the IRES activity
depended on RNA sequences not only in the 5' UTR but
also extending 3' of the translation initiation site
(Hahm et al., 1998; Lemon and Honda, 1997; Reynolds et
al., 1995). Although the 3' border of the HCV IRES is
still controversial, it is believed to involve at most
the first 39 nts of the core gene (Lemon and Honda,
1997). The 5' UTR of the intertypic chimeras was either
a chimera of genotype 1a and 2a sequences or the entire
5' UTR was derived from the 1a clone (Figs. 4, 5A).
Importantly, the 5' end of core is conserved among
genotypes 1a and 2a (Fig. 5A). Thus, the predicted
IRES-like secondary structure is maintained in these
chimeras, suggesting that the IRES activity most likely
was maintained.

Possible interactions between the structural
proteins and the nonstructural proteins and/or the
genomic RNA, which involve RNA packaging, replication or
translation, are conceivable. In poliovirus, which is
another positive-sense RNA virus, functional coupling of
RNA packaging to RNA replication and of RNA replication
to translation has been suggested (Novak and
Kirkegaard, 1994; Nugent et al., 1999). Similar to
other viruses of the Flaviviridae family, a
membrane-associated replicase complex is thought to initiate
replication at the 3' end of HCV and to synthesize a
complementary negative-strand RNA (Rice, 1996). The
putative cis-acting elements at the 5' and 3' termini
which are believed to be important for viral genome
replication (Rice, 1996; Frolov et al., 1998) should be
maintained in the intertypic HCV chimeras at least in
the two constructs with the authentic 1a 5' UTR.
However, it is conceivable that the viral packaging
system was interrupted (Frolov et al., 1998). Studies
using a Kunjin flavivirus replicon system and providing
the structural proteins in trans suggested that the
essential encapsidation signals did not reside in the
structural region of the genome (Khromykh et al., 1997,
1998). The location of the packaging signals of HCV is
not known. However, if the structural proteins
encapsidate viral RNA via genotype-specific sequences
outside of the structural region, the chimeras would be
unable to package the RNA and it might be extremely
difficult to construct viable chimeras between highly
divergent strains.

Example 3 

A consensus molecular clone of 
genotype 2a is infectious in vivo 

In order to prove that the genotype 2a portion
used in the 4 intertypic chimeric cDNA clones indeed
represented the infectious sequence, a consensus
full-length cDNA clone of HC-J6CH (pJ6CF) was constructed.
The core sequence of the T7 promoter, a 5' guanosine
residue and the full-length sequence of HC-J6CH (9711
nts) were cloned into pGEM-9Zf vector using NotI/XbaI
sites. Within the HCV sequence there were no deduced
amino acid differences and only 4 nucleotide differences
(at nucleotide positions 1822, 5494, 9247 and 9289) from
the consensus sequence of HC-J6CH as determined in the
present study. The silent mutation at position 1822 was
within the structural region and so was also present in
the four intertypic chimeras. The 5' terminal 16 nts
and the 3' terminal 82 nts were deduced from previously
published HCV genotype 2a sequences (Okamoto et al.,
1991; Tanaka et al., 1996). The full-length cDNA clone
of genotype 2a contained a 5' UTR of 340 nts, an ORF of
9099 nts encoding 3033 amino acids and a 3' UTR
consisting of a variable region of 39 nts followed by a
132 nucleotide-long polypyrimidine tract interrupted
with 3 A residues and the 3' terminal conserved region
of 98 nts.
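
The segment lengths listed above can be laid end to end to obtain nucleotide coordinates for the clone. In the sketch below, the 3-nt stop codon between the ORF and the variable region is an assumption (it is what makes the 3' UTR interval 9440-9481 in Table 2 span 42 nts and the total come to 9711 nts), as is the reading that the 132-nt polypyrimidine tract includes its 3 A residues. The names are hypothetical.

```python
# Segment lengths in nts, in 5' to 3' order, as stated in the text.
SEGMENTS = [
    ("5' UTR", 340),
    ("ORF", 9099),
    ("stop codon", 3),                # assumed; not stated explicitly
    ("3' UTR variable region", 39),
    ("polypyrimidine tract", 132),    # taken to include the 3 A residues
    ("3' conserved region", 98),
]

def layout(segments):
    """Return (name, first nt, last nt) for consecutive 1-based segments."""
    coords, start = [], 1
    for name, length in segments:
        coords.append((name, start, start + length - 1))
        start += length
    return coords

coords = layout(SEGMENTS)
genome_length = coords[-1][2]

for name, first, last in coords:
    print(f"{name:24s} {first:5d}-{last}")
print("total length:", genome_length)
```

Under these assumptions the ORF occupies nts. 341-9439, matching Table 2, and the segments add up to the stated full length of 9711 nts.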

RNA transcripts from pJ6CF were injected into
the same chimpanzee used for injection of the 4
intertypic chimeras. The chimpanzee became infected at
the first attempt with an HCV titer of 10^2 GE/ml at week
1 post inoculation (p.i.), and 10^3-10^4 GE/ml during weeks
2 to 6 p.i. The consensus sequence of PCR products of
the complete ORF, amplified from serum obtained during
week 5 p.i., was identical to the sequence of pJ6CF and
there was no evidence of a quasispecies. Since RNA
transcripts of this infectious genotype 2a clone were
infectious in vivo, and it shared an exact sequence with
the non-infectious intertypic chimeric clones, their
failure to replicate must have been the result of
incompatibilities between the genotype 1a and 2a
sequences.

To confirm that the chimpanzee used was
susceptible also to infection by genotype 1a, which
comprised most of the intertypic chimeras, the
chimpanzee was subsequently inoculated with RNA
transcripts from the infectious genotype 1a clone
(pCV-H77C). Serum samples were tested in an
H77-specific RT-PCR assay to identify super-infection
with genotype 1a. At week 1 p.i. the total HCV genome
titer was 10^4 GE/ml and the H77-specific (1a) genome
titer was 10^2 GE/ml. The H77-specific genome titer
increased to 10^3 GE/ml at week 2 p.i., and reached 10^4
GE/ml during weeks 3-6 p.i. The consensus sequence of
PCR products amplified with H77-specific primers at
weeks 1-6 p.i. was found to be identical to that of
pCV-H77C. However, the direct sequences of PCR products
amplified with the 5' UTR primers at weeks 1-2 after
inoculation of pCV-H77C were identical to that of pJ6CF,
indicating that the 2a genotype was still present and
represented the majority species. These experiments
confirmed that the inability of the intertypic 1a/2a
cDNA clones to infect the chimpanzee was not the result
of protective immune responses in the chimpanzee but
represented deficiencies intrinsic to the chimeras.

Discussion 

The published infectious cDNA clones of HCV
represent the two most important subtypes of genotype 1
(Hong et al., 1999; Kolykhalov et al., 1997; Yanagi et
al., 1997, 1998). However, 5 more major genotypes of
HCV are recognized. In the above Examples, the
infectivity of a cDNA clone of a second major HCV
genotype was demonstrated. As in previous studies, the
infectivity of RNA transcripts was demonstrated in vivo
by intrahepatic transfection of a chimpanzee. This new
infectious clone (pJ6CF) encodes the consensus
polyprotein of HCV strain HC-J6CH, genotype 2a. Its
encoded polyprotein differs from those of the infectious
clones of genotypes 1a and 1b by approximately 30%
(Table 3). Genotype 2 strains, in particular subtypes
2a and 2b, have a worldwide distribution and important
differences between genotypes 1 and 2 with respect to
pathogenesis and treatment were indicated in previous
studies. The availability of an infectious clone
representing a second major genotype of HCV should
permit new ways of studying the molecular biology and
immunopathology of this important and genetically quite
different human pathogen.

The 5' and 3' UTRs of HCV are believed to be
critical for viral replication, translation and viral
packaging (Rice, 1996). The 5' terminal 203 nucleotides
and the 3' terminal 101 nucleotides of the published
infectious clones of genotypes 1a and 1b were identical.
However, the sequences of UTRs of the genotype 2a clone
differ from those of the genotype 1 clones. Overall,
the 5' UTR of the genotype 2a clone has 17 nt
differences and a single nucleotide deletion compared
with the infectious clones of genotype 1a (Fig. 5A).
Five of these differences and the deletion are within
the first 30 nucleotides, whereas the remainder are
found within the predicted IRES structure. Differences
also exist between the 3' UTR of the genotype 2a clone
and the clones of genotype 1a (Fig. 5B). The sequences
of the variable region are very different. A recent study
has shown that this region is not critical for infectivity in
vivo (Yanagi et al., 1999). Within the regions which
are critical for infectivity in vivo (Yanagi et al.,
1999), the 132 nucleotide-long polypyrimidine tract of
the genotype 2a clone has 3 unique A residues
interspersed and the 3' terminal conserved region of 98
nts has 4 nt differences within the 3' terminal stable
stem-loop structure (Fig. 5B) (Kolykhalov et al., 1996;
Tanaka et al., 1996). Since the 2a clone was infectious,
these sequence differences are apparently real and are
compatible with infectivity. Further studies are
required to determine whether these represent critical
genotype-specific sequences.

WHAT IS CLAIMED IS:

1. A purified and isolated nucleic acid
molecule which encodes human hepatitis C virus of
genotype 2a, said molecule capable of expressing said
virus when transfected into cells.

2. The nucleic acid molecule of claim 1,
wherein said molecule encodes the amino acid sequence of
SEQ ID NO: 2.

3. The nucleic acid molecule of claim 2,
wherein said molecule comprises the nucleic acid
sequence of SEQ ID NO: 1.

4. A DNA construct comprising a nucleic acid
molecule according to claim 1.

5. A DNA construct comprising a nucleic acid
molecule according to claim 3.

6. An RNA transcript of the DNA construct of
claim 4.

7. An RNA transcript of the DNA construct of
claim 5.

8. A cell transfected with the DNA construct
of claim 4.

9. A cell transfected with the DNA construct
of claim 5.

10. A cell transfected with RNA transcript of
claim 6.

11. A cell transfected with RNA transcript of
claim 7.

12. A hepatitis C virus polypeptide produced
by the cell of claims 8 or 9.

13. A hepatitis C virus polypeptide produced
by the cell of claims 10 or 11.

14. A hepatitis C virus produced by the cell
of claims 8 or 9.

15. A hepatitis C virus produced by the cell
of claims 10 or 11.

16. A hepatitis C virus whose genome
comprises a nucleic acid molecule according to claim 1.

17. A hepatitis C virus whose genome
comprises a nucleic acid molecule according to claim 3.






18. A method for producing a hepatitis C
virus comprising transfecting a host cell with the RNA
transcript of claims 6 or 7.

19. A polypeptide encoded by a nucleic acid
sequence according to claim 1.

20. A polypeptide encoded by a nucleic acid
sequence according to claim 3.

21. The polypeptide of claim 19, wherein said
polypeptide is selected from the group consisting of NS3
protease, E1 protein, E2 protein or NS4 protein.



WO 00/75338                                   PCT/US00/15446

22. The polypeptide of claim 20, wherein said
polypeptide is selected from the group consisting of NS3
protease, E1 protein, E2 protein or NS4 protein.

23. A method for assaying candidate antiviral
agents for activity against HCV, comprising:

a) exposing a cell containing the hepatitis
C virus of claims 16 or 17 to the
candidate antiviral agent; and

b) measuring the presence or absence of
hepatitis C virus replication in the cell
of step (a).

24. The method of claim 23, wherein said
replication in step (b) is measured by at least one of
the following: negative strand RT-PCR, quantitative
RT-PCR, Western blot, immunofluorescence, or infectivity in
a susceptible animal.

25. A method for assaying candidate antiviral
agents for activity against HCV, comprising:

a) exposing an HCV protease encoded by a
nucleic acid sequence according to claims
1 or 3 or a fragment thereof to the
candidate antiviral agent in the presence
of a protease substrate; and

b) measuring the protease activity of said
protease.

26. The method of claim 25, wherein said HCV
protease is selected from the group consisting of an NS3
domain protease, an NS3-NS4A fusion polypeptide, or an
NS2-NS3 protease.

27. An antiviral agent identified as having
antiviral activity for HCV by the method of claim 23.

28. An antiviral agent identified as having
antiviral activity for HCV by the method of claim 25.

29. Antibody to the polypeptide of claim 19.

30. Antibody to the polypeptide of claim 20.

31. Antibody to the hepatitis C virus of
claim 16.

32. Antibody to the hepatitis C virus of
claim 17.

33. A method for determining the
susceptibility of cells in vitro to support HCV
infection, comprising the steps of:

a) growing animal cells in vitro;

b) transfecting into said cells the nucleic
acid of claim 1; and

c) determining if said cells show indicia of
HCV replication.

34. The method according to claim 33, wherein
said cells are human cells.

35. A composition comprising a polypeptide of
claim 19 suspended in a suitable amount of a
pharmaceutically acceptable diluent or excipient.

36. A composition comprising a polypeptide of
claim 20 suspended in a suitable amount of a
pharmaceutically acceptable diluent or excipient.

37. A composition comprising a nucleic acid
molecule of claim 1 suspended in a suitable amount of a
pharmaceutically acceptable diluent or excipient.
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OOCACAAOQC AGCTIGGACG TCATAIOGAT CIGCTIGIOG Q3AG0G0CAC 1150 

CL'IC'IUITOB GOOCICEftCG T3333GAOCT GTG333GICT GICTTIUriG 1200 

TIQ3TCAACT OTlTftULTlC '1CTCJ0CAG9C GCCACTO3AC GAOGCAAGAC 1250 

TGCAA TlG'i.T CIMCIKtOC O930CATKEA A0333TCATC QCAIG3CA1G 1300 

GGATATCATG AIGAACIG3T CXXCI3039C AG3GTIQSTG GTAQCICAQC 1350 

TOCIOOQGAT CXX3CAAGX A2CA2G3ACA TCATOQCIQ3 TOCTC30Q3 1400 

GGft Gl l X 'IGS OQQQCATAQC GimTlUlUC MG3IG333A ACTQ33CGAA 1450 

Q3TOCI G STR , GIQCTGCIQC TATTIG0333 03TCGAO303 GAAAOOCAOG 1500 

1CAO0G3333 Q3CAQCA033 ClUUJLTlUr TPSTCTOCIT 1550 

ACAGCAG30G QCAAGCAGAA CATOCAACIG ATCAACAOCA A0G3CAGTIG 1600 

QCACATCAAT AGCAQ330CT TGAATIGCAA TGAAAQQCTT AACACXQ3CT 1650 

GSTEAGCAGG G L'lL ' l'lL'JLA T CAACACAAAT TCAACIUTIC AGGCIGTOCT 1700 

GAGAGGTIG3 C O GCiaXiS ACX30CTTAOC GAUTi'lUDGC AtdGUL 'IUUULj 1750 

TOCIMCAGT TA1GGCAAGG GAAGOQGOCT CGAOGAAD3C GOCIBCXGCT 1800 

03CACTAC0C TOCAAGACCT T3IG3CA3TG '1Q0OCGCAAA GAQCOTCIGT 1850 

GQ00033TAT ATIGCTICAC TOQCAGGOOC GT03TG3IG3 GAAOGAGOGA 1900 



FIG. 6A 

SUBSTITUTE SHEET (RULE 26) 



WO 00/75338 



8/22 



PCT/USOO/15446 



H77C 



10 20 30 40 50 

1234567890 1 234567890 1234567890 1234567890 1234567890 



CAQ3T0Q33C GOQQCIfcCCr ACAQCTQ333 TGCAAATCAT A033A3X3ICT 1950 

TOGIOC TTftA, CAACAOCftGG CJCftOOXIGS GCAATIG3IT 033TIGIHX 2000 

TOGATCAACT CAACIGGATT CAOCAAAGIG TO0QSAGO3C CXXX.TIG1GT 2050 

CA2CGGAQ33 GTD3332AACA. ACAUL'l'lUCT CIQXCCACT GftTIOCTTOC 2100 

G3AWOIDC G3AAGOCTOV TACICID33T OGQQCICOQS TOOCIGGAXT 2150 

ACP£X2CM33? GCMQSIGGA. CEOXlJmT A3QCITIG3C A CUK1ULT1U 2200 

TACCATCAAT TCAAAGTCAG GATOEAOGrlG G3AG333IOG 2250 

AGCACAG9CT G3ftftG0Q30C TGCAACIG3A, CG03333O3A. A09C1GIGAT 2300 

CIGGAAGACA G3GACAGGIC OGAGCICAGC CXCTIGCIGC TGTOCAOCAC 2350 

ACAGIGGCAG GlULTlUUbT GITCTETCAC GftOXTOOCA. QOCTTOIQCA , 2400 

CXDGGOCICAT CCfiOC'lCCAC CAGAACATIG TGGAOSIGCA. GIBCTIUI&C 2450 

GGGGTBGGGT CAAGCATOQC GICCIG330C A3TAAGIG3G AG3B03I03T 2500 

TCTGCIGTIC CTIUroCrro CAGAOQOQOG CGTCIGCIOC TQCITCIG3A. 2550 

TGATCTTACT CATATCCCAA GO33AG30GG CITTGGAGAA, OTCGTAAIA. 2600 

CICAA3GCAG CA3CDCIG3C CG33ACGCAC GJJLLTlUlUr CLTlll ' llbT 2650 

GITCncroC TTTOOSIGCT A3CIGAAGGG 170310333X5 GG033AG333 2700 

TCP03CCCT CTAD33GATC TOGOCICI OC ILCIUC'IOCT ULlUULUl'lG 2750 

GCICAGQG33 CA3CAGQCACT GGACACG3AG G1G30CGO3T CXJIGIQQCQ3 2800 

cxsrrroriCTT gixxostba tggdqcigac tcigtoooca tateacaagc 2850 

GCI3ffiATCAG ClUo'lUCAIG T33IQ3CTIC PGUKCTTTCr GAOCAGM3IA 2900 

GAAQCGCAAC TOCACGIGTG Q Jl ' lUULUJU CICAACGTCC Q333393933 2950 

O3AIQ30GIC ASCrmTICA. TGIGIUmST /OO333A0C CIGGEKETIG 3000 

ACATCAGCAA ACI7CIGCIG GOCATCTICG GAOOOdTTO GATICTICAA 3050 

GOCAGTTIGC TTAAAGICOC CTCTTOGIG CGCETICAAG QULTlUiLUG 3100 

GATCIGGGQG CiaQOGQQSA, AGATftQ0033 AGGTCAT3»C GIGGAAA3Q3 3150 

CCATCA1CAA. GTEAG33333 CTEACTO32A ULTAilUlUJLA TAACCATCie 3200 

AGCCCIUTIC GAGACIG33C GCACAACX33C CIGOGAGATC TO330GIG3C 3250 

TGTQ3AACCA G ' lUb ' lLTILT COaGAATOGA GACCAAGCIC ATCACGTG3G 3300 

GGGCAGAIRC CGOGGO G IQC GGIGACATCA TCAAC333CTT UULULflL'lLT 3350 

GOQCG37033 GOOiGGAGAT A CIGCHQ33 OCAG0Q3BO3 GAATGGICIC 3400 

CAAGQ3GTGG AGGTIQCIG3 C G QG CA TCAC GGOGT3AOGOC CAQCAGAGGA 3450 

GRGGCCTOC r AG93IG1MA ATCADCfiGCC TCACIQ300G GGACAAAAAC 3500 

CAAGTGGAGG GIGAG3TOCA GAIOolG'l CA . ACiUCJLACOC AAAULT1CJCT 3550 

GGCAACGTGC ATCAA3G333 TA0GCIGGAC '1U1CTAOCAC GGGSOCQGAA. 3600 

OGAGGAQCAT CQCAIGADCC AA Q3GTOCTG ICAIOCAGAT GTAHMCAAT 3650 

GIGGAOCAAG AOCTIGIQ33 CTO3000GCT GCICAAGGTT OJGGCTCMT 3700 

GACADQCTGT AOCK3033CT OCTOSGAGCT TDtfXTOGflC AGGAGQCAGG 3750 

OOGA3GICAT TCOOGTOOQC OQQOGAGGTG AXAGCAGGGG TSVUJL'lUL'lT 3800 
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1O3COCJ0QGC CCATTPXTA. CTIGAAAG3C ' lUJlUJUUUa GI033CT3IT 3850 

G1GUUUUG0G G3ACAOG0G3 TO33XTKIT CAGQ900QCX3 GTGIGCaOOC 3900 

GK3GM3IQ3C TAAAQ333IG GACTTEATCC CTGIG3AGAA CCIAG33ACA 3950 

ACCATCAGAT OOOCGGIGIT CACG3ACAAC TOCICTOCAC CAGCAGiaCC 4000 

GCAGAGCTIC C&G3IGGGGC AOC'IGCATOC TOOCAOOGGC AGOTGEAftGA 4050 

QCAOCAAQjT C0093CIGOG TaaSCAGOOC AGGGCIBCAA OGIGITO3IG 4100 

CTQ^AODOCT CIGTIGCIGC AAC3CIG33C TTIGGIGCTr ACATOIGCAA 4150 

GG00CA1GGG GTIGATOCEA A!BflCAG3AC 033G3IGAGA ACAATIAOCA 4200 

CIT33CAQD3C CATCAD31AC TOCBOCIAOG GCAflGTTOCT T330GAO33C 4250 

QGGIGC'ICAG GAQSIQCrm TSACATAAm Ai ' l ' IUlUA OS AGTGOCACIC 4300 

CAOG3A3GCC ACATOCATCr TQ33CATO33 CACIGICdT GAOCAAGC3G 4350 

AGACIG033G GGOGAGACTG GTIGIGCTCG OCA CIULTA C OCC'lULUmC 4400 

1O0GICACIG TGTOOCATOC TAACATOGAG GAQSITOCIC T3IOCAOCAC 4450 

CGGAGAGATC CCCITITAQG GCAAGQCTAT CCGOCTOGAG GTGA1CAAG3 4500 

GGGGAAGACA TCTCALLUJL'JLC T3DCACTCAA AGAAGAAGIG CGACGAQCK: 4550 

QCD303AAGC 1GGTCG2ATT Q33CATCAAT GOTIGGQCT ACEAGOQGQ3 4600 

1CTIGAG3IG T C I GTCA TCC OGACCAGOQG CGAIGTIGTC GTCGTGTOGA 4650 

COGAIGCICT CAIGACIQ3C T1TAQ33QCG ACTTOGACIC 1GIGA3AGAC 4700 

1GCAACA0GT GIGTCACTCA GACAGIGGAT TICAQOCTIG AOCCTAQCTT 4750 

TAOCATIGAG ACAACCAGQC TCQQ0CAG3A TOCTGICTCC AGGACICAAC 4800 

GQQ033GCAG GACTGGCAGG GGGAAGOCAG GCATCTATAG A2TIGTGGCA 4850 

G0Q0QQ3AGC Gia.UlL.LUG CAIGTTOGAC TO3ICGGICC TCIGTGAGIG 4900 

CTATCACG33 UXIUIULTI ' GGTATCAGCT CAO3O0O3QC GAGACTACAG 4950 

TEAQ3CTACG AQCGTACATC AACACO0 CG G GULTIUULUI ' GIGCCAQ3AC 5000 

CAICTIGAAT TTTS3GAQGG CGICTTIACG GGOCICACIC AIAXAGATOC 5050 

GCACTnTTA 1C0CAGACAA AGCAGAGIGG GGAGAACTIT UJriALL'lUJ 5100 

TAG03IAGCA AGQCACCGIG 1GOGZCW333 CICA A GOOQC TCPO0C A TOG 5150 

1GGGAGCAGA TGIQ3AAGIG TTIGATOCGC CTTAAAOQCA GQCIOCKIGG 5200 

GOCAACACOC CIGCIAXACA GACIQ33Q3C TGTICAGAAT GAAGICAOOC 5250 

1GAOQCAO0C AAICAOCAAA TACATCAUGA CA3GCATCIC GGCOGAQCIG 5300 

GAG3TCGICA CGAGCACCIG UJIU- ' IUUIT OXXiJJJLlX: ' IQjl ' lUC ' lCT 5350 

G30CG0GIAT T3GCIGTCAA CAGGCIGGGT GGTCATAGTG G9CAG3AID3 5400 

' lUl'lUlLLOj GAA GG GG G CA ATDOaOCIG ACAGG3AGGT ' ll ' llTft OCEG 5450 

GAGnOGATC AGATO3AAGA GTGCIUICAG CACnAOOT ACATOGAGCA 5500 

AQQGATCATC CKX3CTGAGC AGTICAAGCA GAAQ300CIC G3XTOCTQC 5550 

AGA0QG03TC CCGCCATOCA GAGGTIA3CA CQX ' IGC ' l^T CCAGAOCAAC 5600 

TGGCAGAAAC TOGAGGTCIT TIGGGOGAAG CACAIGIGGA ATITCATCAG 5650 

TOQGATACAA TACITOQOGG GGCIGICAAC <JL ' 1UJL ' 1U>T AACUL03CCA 5700 
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TTOCTICftXr GKXGQL'JLTIT M2&2X3G03 TCACCM30CC ACEAAD3CT 5750 

GQOCAAAOOC TCCICTICAA CMM TO3 3 S G33IG33I03 C1UO0GAOCT 5800 

OjULaiCCCC Q3IG003Cm CIQOCrnGT 033100030 CIBGCTOQOS 5850 

CCQCCATCG3 CAG33TIOG& CT0333AAGG TDLlUb'lOA GKHCl'lOA 5900 

cmilATCOQS OQljQUG'lOX.' GQGAGCICTT GfEftOCATICA. PGKTCKK3N3 5950 

033IGAG3IC OaCTOCa OGS AGGROCIQSr CAKiL'lUC'JO OC033CATCC 6000 

1CTOQ3CT33 A XgJTSm CJ1UJJ1U1G3 TCTO09CW3C 6050 

CX332A03TTG GOOCGGQOGA. G3333CW3IG CWOOSATCA A03330AAT 6100 

AQCCITOXC TO0033333A AUJAIi.UJ.TIC CDCXZADQCAC UOJ1UUU33 6150 

AGAG0GA3GC A3D0G00CGC GICACIQOCA. TBCICAGCAG GCXGACIGE& 6200 

ACCCAGCICC TGAQ33GACT QCATCftGIOS ATAAQCimS ASTSEftCEAC 6250 

1CCATOCIOC Q3TroCIG 3C TAAQ33ACAT CI033ACI03 AaMOOGAGG 6300 

T3CIC3AG33A CTITAAGACC T3XTGAAM3 CCAAQCICAT GDCACAACIG 6350 

0CIO33ATIC OCrrroiGIC CIGOCAGGOZ GSSEKUOSS GQGflCTQQOS 6400 

A03AGA093Z ATT3VTOCACA CICGCT3XA CIGIGGAO^T GAGAICftCIG 6900 

GACATCTCAA, AAA0033A0G ATCAG3ATO3 T033IOCIBG GACCIGCaGS 6950 

AACAIGTO3A GTO33ACGIT OCCCATEAAC QCCTACAOCA O033OXCIG 6550 

TACICCCCTT OCIGwQOCGA. ACTATAAGIT OJLUJ1U1G3 AG33IGICIG 6600 

CAGAGGAAIA 0GIO3AGA3A AQ33333I03 G33ALTICCA. CTAQ5IKI05 6650 

GCTKIGAC3A CIGACAAICT TAAA1CCOOS TOOCAGAIOC CAI Q33333A 6700 

ATTITICACA GAATIQ3AGG Q33IGCX3CXT ACfiCAQSTTT Q OJUU UULTr 6750 

GCAft UULLTl * GCIQ03GGAG GAGGTCAICAT ACIUIAOGftG 6800 

TaOQQQSIQ3 OSICCC A ATT AOCTIQOGSG O303AAG33G AOGfraOOOCT 6850 

CTIGAGGICC AJGCTCfiCIG A2O0CIOQCA. T&TPACAQCA GAtJcUA*-OJ 6900 

QGftGAftQGTT G30GAGAQ3G TCAODOOCIT (J1MUJCCAG CIOCTO33CT 6950 

AGOCAGCIGT CCGC'ICC A JC 1CTCAAGGCA ACTIQCAOC33 GCAAOCKEG& 7000 

CnxCCIGAC GCCGAGCICA T3GAGGCBA OCTOCIGIQG AQ3CAGGftGA. 7050 

T330003CAA CATCAOCAG3 GriGAGICAG AGAACAAAGT GoT^grgg 7100 

GftCTCenOG ATCD3ZTlGr Q9CAGAQGAG GA1GAG0333 AQSIUTODGrP 7150 

ACCTOCAGAA ATXC1Q03 3A AG1CIQ32AG ATICGOOTS aXL'lULD3G 7200 

TCIO33330S GOOQGACIBC AAG000003C TfiGMGAC GI03AAAAAG 7250 

CCIGACIftOG AAOCAOCIUT GSIOCAIOGC U UUULU.THC CAOCTOC AaS 7300 

G'iCCCCICCT GIQOCIOaaC CTCGGAAAAA GXflWDQCSrrG uillilhIDQG 7350 

AATCAACCCT ATCEACIGCC TTO3C03AGC TIQOCAGCAA AAGITTIgC 7400 

AQCTOCICAA CTIO OG GC ftT TftOB3QQ G AC AATAOGACAA ggg ™^ 7450 

QCCCaXCCT TCTOGCIGOC OUULCGACTC OGAOGTIGAG iu,xki av-1T 7500 

OCATOCCCOC CCIOGAG333 GAG0CI0333 A3D0OGATCT CAGCGRDOTG 7550 

1CATO3TOGA CQoICAGIAG T333C3333AC A033AAGATO TOOIOK3CIG 7600 
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CTCAATCICT TAXTOCTOGA CAQGOGCACr G3ICAGOQ03 TOGGL'lULULj 7650 

AAGAACAAAA ACIGOOCATC AA09CACIGA. OCAACTCGIT OCE&CGXAT 7700 

CACAATCIQ3 'lUiXl'lUJAC CACTICAGQC AGIGCITOOC *AAG3CAGAA 7750 

GAAAGICACA. TTIGACAGAC TOCAAGTICr GGACAGOCAr TAOCAG3A03 7800 

1GCICAAQ3A. GGTCAAAQCA G033CPK 3 A AAGIGAAGGC TAA CTIUL ' IA 7850 

TOOGIKGAGG AAQZTTQ2AG CCIGAQ3CDC CCACAll'Il^G OCAAAIOCAA. 7900 

Grrroaaa T ggggcaaaag adstoosits ccatoocaga apqocobbg 7950 

CCCACATCAA. CIOOSIGIGS AAAGAULT1C TGGAAGACAG T3TAACAOCA 8000 

AIAGACACIA O CA ICATCGC CAAGAAOGAG GnTCCIGOG TTCAQQGIGA 8050 

GAAGG33GGT Q3TAAGOCAG CTOGflCICAT GGIGTTOOQC GA0CTG33OG 8100 

U GCCJGGICf J L G GGAGAAGATG GOCCTSmjG AOGIGGTT3V3 CAAGCTOOOC 8150 

CT3Q0QSIGA, TQGGAAGCIC CEA03GATIC CAATEftCICAC CAGGACAOGG 8200 

GGTTGAATIC CTOGIGCAAG OGTOGAAGIC CAAGAAGAOC COGA3QGGGT 8250 

TCIGGTATCA. TAOOOGCIGT TTIGACTOCA CAGICACIGA. GAGOGACATC 8300 

OGT3U33GA03 AQGCAATITA CCAA1GTIGT GAOCK33ACC COCAAQOCCG 8350 

OGTGQOCATC AAGTOOCICA CTGAGAG3CT TEATGTIQGG GGO CCICTIA 8400 

CCAATICAAG QGGSGAAAAC TGQ3XTEAOC GCAGGTGOOG CGGGAGCQGC 8450 

GEACIGACAA. CEAQCIGJIQG TAACAOOCIC ACTIGCTACA. 1CAAGGCQOG 8500 

G3CAG0GIGT GGAGOOGCAG GGCTOCAGGA. CIGCACCA3G C IQGIGI GIG 8550 

GC3GADGACTT AGTO3ITATC TGTGAAAGTG 033333TCCA GGAGGAQ3GG 8600 

GQGAGQCIGA. GAGCCTICAC G3AQQC3MG AQCAGGTACT OOGOXOJUC 8650 

GQ933A000C CCACAACCAG AATAOGACTT GGAGCTDOA ACATCA3GCT 8700 

CCIGC A ACGT GICAGIOGGC CAGGA0GGO3 CTGGAAAGAG GGTCEACiaC 8750 

CITAOQCGTG ACCCIBCAAC OOOOCTOQJG AGAGCOGC3GT G3GAGACAGC 8800 

AAGACACACT GZAGICAATT CCIG3CTAQ3 CAACATAATC AIGTTIGO0C 8850 

CCACACIG1G GGCGAGoATC AIACIGATCA. OOCATTIdT TMCXilUL'lU 8900 

AIAG0CAG33 A3CAGCTIGA. ACA GGCICIT AACIGTGAGA T C POG G AGC 8950 

CIGCIACIOC AXAGAAGCAC TOGATCIACC 'lUCAAICATT CAAAGACTOC 9000 

A3Q3QCTCAG CGCATITICA C1CCACAGTT AC1C1UCAGG TGAAATCAAT 9050 

AG93IG300G CATOQCICAG AAAACTIG33 GTOOOGOOCT 1GOGAGCTIG 9100 

GAGACAOGGG GOOGG G AGOG TCO30QCTAG GCTICIGTOC AGAQ3AGGCA. 9150 

GGGCIGOCAT ATGIQGCAAG TACCICTICA ACIGGQCAGT AAGAACAAAG 9200 

CICAAACICA CTOCAATAGC OJCXJOO'lQiC OGGCTGGACT T3T00QGTTG 9250 

GTICACGGCT GGCEACAGGG G3GGAGACAT TDtfECACAGC GIGTCICA2G 9300 

O0OGG000GG CIQGTICIGG TITXGOCIAC lOCTGCTOGC TOCAGG33IA 9350 

G3CATCIACC TCCIGOOCAA, OOGA1GAAGG TIGGGGTAAA CACIOGQGCC 9400 

1CTIAAG0CA TTTOGIGTIT TiTl'lTO'lT imTnTTT 'lTlTlL'l'l'l'l' 9450 

■mnncTT tcctticcit cnmna: Tricrrrnc ccncnrAA. 9500 
TOgTOG CIO : A TCITft GOOC T&GICACQ32 TAQCIGIGAA. AQGIOOSIGA. 9550 
GCJCX3ZAIGAC TOCAGAGAGT QCIGftTRCTO GOCTCICIOC AGATCATOT 9599 

FIG. 6F 
MS1NPKFQRK TKRNINRRFQ EVKFP3332I VGSVYLLPBR GPRU3VRKIR 50 

KISER9QERG PRQPIPKARR PB30WAQFG YFWPLYGNEG 03sJM3fKUSP 100 

RGSRPSWGPT DPFFRSRNDG KVIDTLTOGF AHM3YIELV GAFD93AARA 150 

IAH3VKVLED GVNY70X3NLP GCSFSIFLIA UflCLTVPAS AXQVPNS9GL 200 

YHVINDCPNS SIVYEAAEAI IHTPQCVFCV BB3NASRCWV AVIPTVArRD 250 

GKLFTTQLRR HTTXLVGSAT LCSALYVGOL CGSVFLV3QL FIFSPFRWT 300 

1QDCNCSIYP QCTIGHRMAW E&MMNWSPIA AUWAQLLKE PQftlMCMEAG 350 

AHWGVIAGIA VLNVLLLF2G VEAEIWIGG NftLKi'I!AGSLV 400 

GaLLTPGAKQN IQUNINSSW HINSIMHCN ESLNIGWLftG IFYQHKFNSS 450 

QCPERLASCK PISXBN39GL EEFPTOWHXP PRF03IVPAK 500 

SVCGFVYCFT PSPWVGTID BSGAFTYSWG ANDIEVFVLN NIRPPLGNWF 550 

GCIVMNSIGF TKVOGAPFCV IGG\A3^NTLL CPTDCFRKHP E&TYSRG393 600 

PWrTEROWD YFYFJUWHYPC TINyiTFKVR MAA33VEHRL EAACNWIPGE 650 

RCDLEEHERS ELSPLLLSIT QWQVLFCSFT TLPALSTCLI HLH^NIVEWQ 700 

YLYGVGSSIA SWAIKWEXW LLFLLLAEAR VC9CUAM1LL ISQAEAALEN 750 

LVXLNAASLA GTB3LVSFLV FK2-AWYLK3 RWVPGAVXAL VTM/JPr.TT.T.T. 800 

IALFQRAYAL DIEV/AAS033 WLVGIJMALT LSFTCKRYIS VO^WDQifFL 850 

TKVEAQLHVW VPPLNVFQGR DKVILL&CW' HPTLVFDnK LLIAIPGPUW 900 

IDQASLLKVP YFVKVQ3LLR ICALARKIAG GHYV£J4AIIK D3ALTCIYVY 950 

NHLTPLREWA. HNGLRDLAVA VEPWFSFME IKLTIWGMJT AAOGDIIN5L 1000 

PVSARFGQEI LLGPADGMVS KGWRIXAPIT AYAQCfURGLL GCITTSLTGR 1050 

EKNQ7EGEVQ IVSTATQIFL ATCBSGVCWT VYHGAGITOT ASFKGPVIQM 1100 

YINVDQELVG WPAPQGSRSL TPCTOGSSEL VLVIRHAEWI PVRRRGDSRG 1150 

SLLSERPISY LKGS93GPLL CPAGHAVGLF RAKVCIPGWA KAVCFTPVEN 1200 

DGTIMRSPVF TENSSPPAVP QSFQVAHLHA. PTGSGKSIKV PAAYAAQGHK 1250 

VLVLNPSVAA l&GFGKiMSK AH3VDPNIFT GVRiTl'iUSP ITifSTTCKFL 1300 

AD33C933AY DIIICDE3CHS 1DATSILGIG TVLDQAEIAG AKLWLATAT 1350 

PPGSVIVSHP NIEEVALSTT GEIPFY3<AI PUEVIK3GKH LZFCHSKKKC 1400 

DELAAKLVAL GINAVAYYPG LOTSVIPTSG EWWS1DAL MiUb'lUUbDS 1450 

VUXNICVIQ TVDFSLDPIF HEITIL FQD AVSRTQRRGR TGR3CPGTYR 1500 

FVMGERPSG MFDSSVLCEC "YDAGCAWYEL TPAETIVRLR AYMNTPGLPV 1550 

OQEHLEEWEG VFIGLTHIEA HFLSQIHQ9G ENFPYLVAXQ ATVCARAQAP 1600 

PPSWDgyWKC LIRLKPTLH3 FTPLLYRLCA VQflEVTLTHP ITKYIMICMS 1650 

ADLEWISTW VLVGGVLAAL AAYCLSIGCV VIVGKEVL93 KPAHECKEV 1700 

LYQEFDEMEE C9QHLPYIBQ GMflLAEQFKQ KADSLXJQfTAS RHMVTTPAV 1750 

QINWQ^LEVF WAKHMrtNFTS GIQSflAGLST LPGNPAIASL MRFTAAVISP 1800 

LTIUQULLFN ILGGWs/AAQL AAPGftAXftFV GAGLAGAAIG SVK30GKVLVD 1850 

ILAGYGAGVA GALVAHCEMS GEVPSIEDLV NLLPATT SPG ALW3WCAA 1900 
ILRRHVGPGE GAWQWMNFOjI AEASRGNHVS FIHYVFESQ\ AARVCVTLSS 1950 

LTVIQLLRRL HQWISSEETT PC93SWLHDI WDWICEVLSD FKTWLKAKLM 2000 

PQLPGIPFVS CQRGYFGVWR GDGIMHIPCH OSAETIGHVK N3TMRIV3ER 2050 

TCPJttWSGIF PINAYTIGPC TPLPAPNXKF ADWRVSAEEY VETRFWGDEH 2100 

YVSGMTECNL KCPOQIPSPE FTTEIjDGVRL HREAPFCKPL 2150 

LHEYPVGSQL PCEPEPIMAV LTS^LTDPSH nMMGRRL ARGSPPSMAS 2200 

SSASQLSAPS LKAICITiNHD SEDAELIEAN LUWRQEM33N TIRVESENKV" 2250 

VIIJDSFDPLV AEECEREVSV RAETTRKSRR FARALPVWAR PD¥NPPLVET 2300 

WKKPEWEPPV VH3CELPPPR SPPVPPPRKK RIWdESIL STALAELAIK 2350 

SFGSSST93I TGCNTTISSE PAPSQCPPDS EVESY524PP T.H^HJL)HJLi 2400 

SDGSWSTVSS GADIEDWDC SMStfSWIGftL VIPCAAEEQK LPINAL3JSL 2450 

LRHHNLVYST TSRSAOQRQK KVltLMJQVL DSHY£OVLKE VKAAASKVKA 2500 

NLLSVEEACS LTPFHSAKSK FGYGAKCVRC HARKAVAHIN SVWKDLLEDS 2550 

vrprornMA knevpcvqpe kqgrkparli vfpddsurvc ekmalyexa/s 2600 

KLPLAVM3SS TiGFQXSPGQR VEFLVQAWKS KKTOCFSXD TJOvDSTVTE 2650 

SDIKTEEAIY QCCEHLDPQAR VATKSLTEKL YVGGPLTNSR GENG3YRRCR 2700 

ASGVLTT90G NILTCYIKAR AACRAAGLQD CIMLVOGEOL WICESAGVQ 2750 

EDAASLRAFT EAMIRYSAPP GDPFQPEYDL ELIT9CSSNV SVAHDGAGKR 2800 

VYYLTRDPTT FLAFAAWETA RHTPVNSWLG NIIMEAPI1W ARMILMIHFF 2850 

SVLIARDQLE QALNCETK3A CYSIEPLDLP PIIQKLH3LS AFSUHSYSPG 2900 

EINRVAACLR KLGVPPURAW RHRARSVRAR LLSRGGRAAI OGKXLFNWAV 2950 

RTKLKLTPIA AAGRI2XSGW FTftGYSGSDI YHSVSHARPR WFWFCLLLIA 3000 

AGVGIYLLfN R 3011 
G3CAGCC00C T3ATO333GC GACACIOCAC CA3GAATCAC TO00CIG1GA 50 

Q3AACIACIG TCITCAOQCA GAAAQOGTCT AQ0CATO3OG TEACTATCAG 100 

' lUH U lUCA G CCIOCAQGAC OQUUUL'lUC C Q35AGAQCrA Ta tilUJlClU 150 

QQGAAQOGGT GAGIACACOG GAATIGCCAG GAOGA003GG TOCITICTIG 200 

GATCAAOOCG CICAA3QQCT G3AGATCTOG QCGTGOCOOC GOGAGACIGC 250 

TAGOOGAGEA GT3TIQ33IC QOGAAAQQOC TIGTGGTACT GOC3GATAG3 300 

GTGC1TGQGA GTG0QCGQ93 AG3ICIGSBV GACOnGCAC CATCAGCA03 350 

7AT0CEAAAC CTCAAAGAAA AAOCAAAQGT AACACCAAOC GOOQOOCACA 400 

GGAOGTCAAG TIOQQG930G GIG3TCAGAT Q3TIGGTGGA GTTIAGCTGT 450 

TO0030GCAG GGGC00CAQ3 TIGGGIGIGC Q030GACEAG GAAGJLTiUC 500 

GAGOGSTOGC AAOCT0SIQ3 AAGQQGACAA GCEATOOCAA AGGCT0G00S 550 

AOOQGAGGGC AGGQGCIG35 CICAGQ00Q3 GTADOCTIGG GOQCICEATC 600 

GCAATSAGGG CCTGQGG1G3 GCAGGATOQC TOCIGTCAOC OCXXX33CIOC 650 

O3Q0CEAGTT G33OOQ0CAC GGAO0OO0QG GGTAGG1Q9C GTAACTIOGG 700 

TAAGGTCAIC GATACOCTTA CATOQGQCTT OGOOGATCIC ATO333TACA 750 

TIOGGCTOGT OGGOGCCaX 1 CTAG3GG90G C1GQCAG33C CTIGQCACAC 800 

GGIGTOOGGG 1TCIGGAQGA GG30GIGAAC TAIGCAACAG G3AA CTI G 0C 850 

OG3TIGCTCT TTCICEA3CT TQL'ILTIUQC 1CIGCIGTOC TGTTIGACCA 900 

TDOCAQCTIC GQCTEATGAA GTGOSCAACG T3TO0GQGAT ATAOCAOGIC 950 

ADGAAOGACT GCTOCAACIC AAGCATIGIG TATOAG9CAG 00GAO3IGAT 1000 

CATQCATACT C0Q333TQ03 TOGOCIGIGT TCAGGAQQGT AACAGCT00C 1050 

GTIGCTGGGT AGOQCICACT CCCA03CTCG OGGCCAGGAA TOOCAGOGIC 1100 

COCACTAOGA CAATAOGACG OCAOGTDGAC TIGCKDGTIG QGADQQCIGC 1150 

Tl ' lL ' lUL ' ll C GCTATGTADG T33GGGATCT CIGOGGATCT ATITIQCTCG 1200 

TCXCCCAGCT GTICAOCITC T0G0CTGG0C GGCA3GAGAC AGTGCAQGAC 1250 

TOCAACIQCT CAATCEATOC OO30CATOIA TCAGGICAOC GCA3G3CTT3 1300 

GGAIATGA'IG ATGAACIG3T CACCTACAAC AGOQCIAGIG GIGTOQCAGT 1350 

TOCKXX3GAT COCACAAGCT GT0GIGG ACA, TG3IGG0933 QQ00CACT93 1400 

QGAGroCIGG OQUULC'i'ia; CIALTATIUC A3GGTAG3SA ACIQ33CEAA 1450 

GGITCTGATT G l UUUU L ' lft C TCTTIGCOGG OGTIGAOGGG GAGAOJCACA 1500 

OGAO30G3AG GGIGGOOGGC CACAOCAOCT GOGGGTICAC GTOXTTTIC 1550 

TCAICIGGGG OGICICAGAA AATCCAGCTT GIGAATACCA ACGGCAGCIG 1600 

GCACATCAAC AGGACTGOOC TAAATTGCAA TGACTOOCIC CAAACIGG3T 1650 

TCTTIG003C GCIGrnTAC GCACACAAGT TCAACTCGIG O333IG0CDG 1700 

GAQ03CATOG CCAGCIGO0G CCQCATIGAC ' l Ub TlO ULUL' AGG G GIGG9S 1750 

CXXXATCACC TATACI?\AGC CIAACAGCIC GGATCAGAGG OCTEATIQCT 1800 

GQCATEA03C GOCTOGACOG ' 1U1GG1U1UG TACOOGCGIC GCAG3IG1GT 1850 

GGTOCAGIGT ATIGTTICAC CCCAAG00CT u n C H OS lC G G3A0CAQ0GA 1900 
' ICGX ' X CCGCT UlUUL'UAOGfT G3AGAA3GAG ACAGAOGIGA 1950 

TOCTOCTCAA CAACAOGOGT OQQQCACAAG GCAACIQoIT OGGOTCfUACA 2000 

T3GATCAATA GIACIG33IT CACIAfiGACG TOCQ3AQ3IC (X COglG'lAA 2050 

CAT0GO0Q3G GTU33IAAOC GGAOCITGAT C3Q000CACG GACIOCTIOC 2100 

QGAAQCADOC CGAQ3CEftCT IfcCACAAAftT GIG3CICQ33 GOCXTIGSnG 2150 

ACAOcraosr goceagibga. cmaxaaac auulutiggc Acraooocro 2200 

CACICICAAT TlTia CSfflCr TIBAGGTEAG GAUUIMULG G333033IO3 2250 

AGCACAGQCT CAATOD03IA TOCAATiaSA. CTOGaGGAGA QCGCTOIAAC 2300 

TIQSAGGftCA AGAACICAGC ULUL ' IUL ' IUU TGICTBCAAC 2350 

AGAGIQXftG ATftCIQOOCT GflQCITICAC CBUUCliAOOG QCTTEATOCA 2400 

CTOG7ITIGAT CCAIUTCCAT CAGAACATCG TO3A03IQCA AraOCTOEAC 2450 

G3TDG.TAG37r Ott30GITIGT CTOL'JLTIUCA. ATCAAAT93G AGTftCATQCT 2500 

grroerrnrc c tic iq c iqg cagaogqqos osrororooc txtigigga 2550 

T3A3GCIQCT GATAOOOCAG GCIGAQGOOG UJi'UAGAGAA, CTiaGTOSIC 2600 

CTCAATOG33 OSTG03TOGC O33AG03CAT GGDOTCICT OCnTCITGT 2650 

GTICnCro C GOCX3CX7IQGT ACATEAAQ33 CAQ3CT33CT 000393035 2700 

CCTATOCTIT TTATOGCGTA. TQQ009CIGC TOCIGCIOCT ACIQQOSITA 2750 

OCAQCAGGAG CrEftO30CTT GGAC03GGAG ATO9CIGCAT O3IG03339G 2800 

TQOGSnCIT GEAGoICIGG TATICTTCAC CTIGTCAOCA TACTACAAAG 2850 

TSTTICIC RC TAG3CTCATA TGGIGSTTftC AATAdTEAT CADCAGAQOC 2900 

GAGQOGCACA TOCAAGIGIG GSICCOXCC CICAAOGTIC Q335ftQ3DGG 2950 

CGATQOCA1C ATCC'IOCTCA CGIGIQCGGT TCATCCAGAG TEAAT-LTi'lU 3000 

ACATCACCAA ACICCIGCIC GQCATACTOG GaJUUL'lVJAT Q3IQCTOCAG 3050 

GCIGQCA03A OGAGAGIG0C GEAC37ID3IG OG03CICAAG Q3CICATKX3 3100 

TQCATOCATC TIAGTQOGAA AAGTOGOOOG Q03ICATEAT GTOCAAAIQG 3150 

TCTICA3GAA QL'IUGOOQOG CIGACBGGIA OGTWLU1T1A TAACOTCIT 3200 

ACOOCACIOC G3GACTGGGC CCAO30Q33C CEAOGAGAOC TIUO GSIUJC 3250 

OGTEAGAQOCC GIOC3ICncr OO30CA3GGA. GAOCAftGGIC MCAOCT393 3300 

GAGCAGACAC OGCTOCGIGT QQGGACAJCA. TCTTOGGIUT ALXXXJIL 'IOC 3350 

G0C0GAAG93 QGAAGGAGAT ALL'l'I'i'lUGGA. 0333CIGATA GTCI03AAGG 3400 

GCAAQQCT33 OGft CTOCTIG O300CATCAC Q90CTACTOC CAAC7AA03C 3450 

GQQQOGIACr TGGTIG CAJC ATCACIMOC TCACAG330G GGACAAGAAC 3500 

CAGGTOGAAG GQGAGSTICA. AGTOGTTICT AQCX3ZAACAC AAICITTOCT 3550 

GQOGAOCTOC ATCAACQ30G 1GTOCIGSAC TGICEAGCAT G909CIGQCT 3600 

GGAAGAOOCT AGOCGSTOCA AAAGGTOCAA TCAOOCAAAT GD«ZAOCAAT 3650 

GTAGACCT33 AOCT0GTP0G S CTQ3CAG30G O00000339G O303CTOCAT 3700 

GACAOCATGC AGCIGTGQCA GCTO3GAGCT TBOTOSTC AOGAGACATC 3750 

CTGATGICAT TCXX3GT309C O0G0GAG90G ACAGCAG333 AAGICDOC 3800 
1QCCCCM33C OCX3ICTOCIA OCTGAAAGGC 'lUL'imJG'IG G'lUJAIl'lULT 3850 

T1UUULTIU3 093303105 T333C33ICIT O0833CIGCT G1U1UGAOX 3900 

G3GG33T0GC GAAG3GQ3IG GACTICATAC COGTTGAGTC TMQGAAACr 3950 

ADCAIGCOSr CTOQQGIC1T CACAGACAAC TCAAO00CXE Q33CTSiaOC 4000 

GCAGACATIC CAAGIGXAC A TCT 3 GA 03C TOCIBCIGGC AQ333CAAGA 4050 

GCACCAAAGT Q0033CIQOS TAT3CAGOCC AAGGGTACAA GQIGCIOSIC 4100 

cigaacccgt uLunuuuuL ' cAacrraoaG ircoosaasr aomgiccaa 4i50 

G3CACAOOGT A303WXCTA ACAICAGAAC TO33CTAAQ3 AOCATEftOCA 4200 

OG9GGGGCIC CATEAOCTAC TOCADCITOG QCAAGTIGCT 1G3D3A033T 4250 

QQ CIGnCIG G330330CIA T3ACA3CA3A A2ATCIGAJG AGB3DCACIC 4300 

AACIGACICG ACTAOCATCT TGGGCAIOQG CACAGTOCIG GAOCAAG0O3 4350 

AGACQ3TIG3 AGOQ03XTC GICGIGCI03 OCACX33CTAC ACCIU0U33A 4400 

TCX33ITAQCG T3DCACACCC CAAIATCGAG GAAATAO30C 'lUlUJAACAA 4450 

TOGAGAGATC UULTIL ' JjA IG GCAAAGOCAT OCOCATIGAG GOCATCAAGG 4500 

Q3Q33AQ3CA ICICAUTI'IU TGCCATTCCA AGAAGAAAIG TCAOGAGCTC 4550 

QQOQCAAAGC T3ACAGGOCT OQGACIGAAC GCTGISAGZAT ATIA003333 4600 

OCTIGA3GTG TCCGICA1AC UUUL'imCG G AGADGTOSIT GICGIQQCAA 4650 

CAGACGCICT AAT3AOQ93T TICAO033O3 ATJ.T1 GACIC AGIGAIOGAC 4700 

T3CAATACAT CTGICAOQCA GACAGIDGAC TIXZAGCTIG3 ATOQCACCIT 4750 

CAOCATK3AG ACGAOSACOS TQOOOCAAGA CX30GGH3ICX3 COCTCQCAAC 4800 

Q3QGAGCTAG AACK33CAQG GGIAQGAGIG QGA1UIACAG GTnGIGACT 4850 

CCAQ3AGAAC G3QCCID333 CATCTTCGAT TCITOQSIOC 1GTSIGAGTG 4900 

CTATCACGOG GQLUU1GC1T GGTKIGAGCT CAO3CO03CT GAGACCB33G 4950 

TIAGGTIGGG G9CTTAOCTA AATACAOCAG GGTTGOCQ3T CIGOCAGGAC 5000 

CATCTOGAGT ' IC1 G G G A GAG CX3ICTICACA QGOCTCAOOC ACATAGAIGC 5050 

OCA LTll C' lG TOCCAGACTA AACAGGCAGG AGACAACTTT CCTEACCIGG 5100 

TGGCA3JOCA AGCTACAGIG TQ030CAGGG CICAAGCTOC ACCTCCATCG 5150 

TGGGAGCAAA TGIGGAAGIG TCICATACQG CTGAAAGCTA CAC1GCAOQG 5200 

GOCAACAOOC CIGCIGEKIA GGCTAGGAGC CGIDCAAAAT GAQ3ICA30C 5250 

TCACACAOOC CAIAACIAAA TACATCAIQS CA3GCA3GIC OSCIGAOCIG 5300 

GAGGianCA CraGCADCIG GGTOCIGGIA QGOGGAGIOC TIGCAULT1T 5350 

GGOCGCAIAC TOOCIGACGA CAGGCAGIGT GSICATIGIG GGCAGGATCA 5400 

' 1LT1G1CC Q 5 GAAQCCAGCT GTOSTTOOO G ACAGGGAAGT CUlLTftCCftG 5450 

GAGTTOGMG AGATO3AAGA G'lUlOX' X C A CAACTIOCTT ACATO3AGCA 5500 

GGGAATGCAG CTOQOCGAGC AATTCAAGCA AAAGGOQCIC Q33TIOTIGC 5550 

AAACOSOCAC CAAGCAAG03 GAGGCTGCIG CXCOOSIGCT QGAGIO CAAG 5600 

TOGQGAGOGC TTGAGACCTT CIGG30GAAG CACATOIGSA ATITCAICAG 5650 

CQGAATACAG TACCTAGCAG GCITATCCAC TCIQ0CIQGA AAOQ00QQGA 5700 
03M3CKICATr GMTC33CATIT ACAULT1CJA TCAOmOOCC GCICACEAOC 5750 

CAAAACAOCC TOLMUITJAA CATCTTO333 GGAIG33I133 CT1330CAACT 5800 

G3C I CCTO0C AGOSCIGQST CAGCITIOST Q333333333 AIOGCGGGAG 5850 

CX3QCTCTTOG CAGCAaBGGC CTIGGGAAGG TOCIOSI03A CW1CT1UUUS 5900 

G3CEA3QQQ3 CAGGGGTAGC Q3333CMJIC GK330CITIA AQSiaOGAG 5950 

G3G0GAQ3IG OCJCTOCftOaS M3GKJCK33T CAACTEftCIC OCIGOCATOC 6000 

1CIUIDCT3G TOOO CT QSIC GTO3333TC3G '1UIGJ3CAC33 AA3ACIQ33T 6050 

033CA03IG3 GOOOGGGAGA GGGGGCIGIG CAGIGGAJGA AD0Q3CIGAT 6100 

AGOQiraQCT TOG03333m A03AC3TCIC OOCIfcOQCaC TftlGiUUL'lG 6150 

AGAQOGftCQC TGCAGCAOGT GTCACICAGA TOCTCICraG ULTIAGCATC 6200 

ACTCAACIQC TGAAGCQGCT C3CAC3CAGIQG AT3AAIGAG3 ACTGCICIAC 6250 

GaCATGCIOC GGCTCJGIGQC TAAG3GKP3T TIQ33ATIX33 AIATOCAG3G 6300 

T3CTGACIGA CTICAAGAQC TQ3C1CCAGT OCAAACIOCT GOOSCGGTIA 6350 

CCQGGAGTOC CTTTOCIGIC AT3CCAAD3C GQGIACAAGG GAGICIQ303 6400 

GGGGGAOQGC ATCA3GCAAA CX^OCIQOQC ATO0G3AGCA CAGAT0G0OG 6450 

GACATGTCAA AAAOGGTTOC AIGAGGATOS TAG3GCUEAG AACCIGCAGC 6500 

AACACGI03C A033AA03TT CCXDCftlCAAC GCAIACAOCA GGGGAOCTIG 6550 

CftCACOCTOC OOQGCQQCCA ACUOTOCAG GQOGCTMGG 0333IQGCIG 6600 

CIGAGGAGTA OGTOGAOGfTT AOGQU1U1U3 GQGA3TIOCA CEAOGIGAOG 6650 

GGCATCAOCA CIGACAACGT AAAGIGCOCA TQOCAGGTIC GGQ0O000GA 6700 

ATICITCADG GAGGTP3GA3G GAGIGOGGfTT GCACAG3EAC QCTOOGGOGT 6750 

QCAAAQCTUT TCTA033GAG GAQGTCAOGT T0CAG3IO33 GCTCAAOCAA 6800 

TACTIQGTQG G3TCGCAGCT CCCA.T3CX5AG O0QGAACOQ3 AGGIAACAGT 6850 

QCTTACTICC ATOCTCACOG ATGCCIOOCA CATIftCAGCA GAGACG3CIA 6900 

AOGCTAQQCT GGCTAGAGGG UCIQOCJOXT CTITROC3CAG CICATCAGCT 6950 

AGOCAGTIGT CTOOQOCTIC TTIGAAGQGG ACAIGCACTA CXXACCATCA 7000 

CICXX3CX3GAC GCT3ACCTCA TCGAGGOCAA OCICTIGIGG CGGCAGGAGA 7050 

T3330QGAAA CATCACTOGC GIGGAGICAG AGAA2AAGGT AGTAATIC'IG 7100 

' GACICITIC3S AAULULTILA CGOQGAGQGG GAIGAGAQGG AGATATOOGT 7150 

OG0GG0GG A G AT0CIG3GAA AATCCAGGAA GTICO0CTCA GOGTIGOOCA 7200 

TAK33QCAOG CXXX33ACTAC AATDCIOCAC T3CTAGAGIC CIQ3AAGGAC 7250 

CCGGACEAGG TCCCTOGGGT QGTACAGGGA T30QCATIGC CAGCTACCAA 7300 

GGCTOCTOCA ATAOCAOCTC CACGGAGAAA GAGGAOQGTT GTOGTGACAG 7350 

AATCCAA3GT GTCTICTGCC TIQGOQ GA GC TCGOCACTAA GAOGTTOG3T 7400 

AGCTOOGGAT OSIOGGOOGT TGA3AQ3G3C ACGOOGACCG CQCTIOCTGA 7450 

OCT GGO Cro C GA0GAO33T3 ACAAAGGATC OGAOGTIGAG T O GTmCTOCT 7500 

CCATG0000C CCTIGAAGGG GAQQ03333G ACCOOGATCT CAG0GAO333 7550 

TCTIGGICIA COGIGAGIGA GGAGQCTAGT GAQGATOTOG TCK3CT3CTC 7600 
AATGTTCCEAT AOGIGGACAG QQ002CIGAT CA03CTroC QCTQCGGAG3 7650 

AAAGIAAGCT GOOCATCAAC OJbTlGAGCA ACICTTIGCT QOGTCAOCftC 7700 

AACAIG3ICT A09XACAAC ATCOCG2AQ2 GCAAGOCiCJC Q3CAGAAGAA 7750 

QGICAOCTTT GACAGATIQC AAGIOCIGGA TCATCAITAC 0333A0GTAC 7800 

TCAAG3AGAT GAAGSOGAAG GCOTOCACAG TmAGSCIAA QCT ICg ffl CT 7850 

AOaGAGGAGG (XTOCAAGCT GAOGOXOCA CATIOGGOCA AATOCAAATr 7900 

TOQCIMGQG GCAAAGGAOG TO333AACCT ATOCftQCRGG G OOSITO RCC 7950 

ACA3O0GC1C GGTGIG33AG GACTIGCIGG AAGACACIGA AACAOCBA3T 8000 

GACAOCAOCA TCA3G3CAAA. AAGTGAGGTT ITCIGOSIOC AAOCAGAGAA 8050 

GSGAG30CDC AAG0CAGCIC GOCTZKTOGT AT1UUUAGAC CIGGGAGTIC 8100 

GIGTA3GQGA GAAGA1Q30C CTTEACGAOG 1GGTCIOCAC (JLTIUL'ICAG 8150 

QOCTGATOG GCTOCTCA3A OGGATTICAA 1TOOOOOCA. AGCAGOGGGT 8200 

OGAGTIOCIG GriGAAEAOCT GGAAATCAAA GAAATOCDCT A3GGGCTICT 8250 

oomgacac coocrorrrr gacicaadqg tcacigagag TSAcanasr 8300 

GTIGAG3AGT CAATTiaOCA A1GTIGTGAC TIGGO0O00G AG3XAGACA 8350 

O30CATAAG3 TO3CICACAG AGOOGCTTCA CATO3333GT O00CIGACIA 8400 

ACICAAAAQS QCAGAACIGC GGTIMOQQC G3IG003OGC AAGIGGOSIG 8450 

CIGAOGACIA GCIGOGGTAA TAOOCICACA UUl'lftlLTlGA AGGOGftCIGC 8500 

AGOCIGTOGA GCTOCAAAGC TOCAG3ACIG CACGA3GCIC GIGAACGGAG 8550 

ACGAOCTIGT OGITATCIGr GAAAGQQ0G3 GAAOOCAG3A QGATOQ3QQS 8600 

QCCCTAQGAG CXTICAOQGA GGdMGACT AGGTATTOOG CQO0O0OOGG 8650 

GGA3OCGO0C CAAOCAGAAT AGGAOCIGGA GCIGATAACA TC AIGTTOCT 8700 

OCAAIGIGIC AGIGGCGCAC GA3GCATCIG QCAAAAGQGT AlACTOTIC 8750 

AOOOSIGAQC OCAOCACOX: CCTIGCAGGG UL'lUUb'lUJ^ AGACAGCTAG 8800 

ACACACBDCA ATCAACTCIT GGCIAGSCAA. T3flCA!ICA3G T3OGO30C3CA 8850 

OOCIATOGGC AAGGA3GATT CTGA3GACIC ACTTITICIC C30.LL.T1U1A 8900 

GCICAAGAQC AACTIGAAAA AGOOCIGGAT T3TCAGATCT ACGG3QCTTG 8950 

CTAC'iCCATT GAOGCAdTG ADCTAQCICA GATCATIGAA OGACICCATG 9000 

GIUTEAGGGC ATTTACACTC CACAGITACT CTOCAGGTGA GA3GAAXAGS 9050 

GIGGCnCAT GQCICAG3AA. ACTIGGGGIA OCACOCTIGC GAACCIGGAG 9100 

ACAT0QQGOC AGAAGIGIOC GOGCXAAGCT AC I GIOGC A G GQQGGGAGGG 9150 

C3QQOCACTIG TQGCAGATAC CICTTEAACT QQQCAG3AAG GAOCAAQCTT 9200 

AAACICACIC CAATQ00GGC GQOGIOOCAG CTGGACTIGT CIGUL'lUjlT 9250 

OGIDQCIGGrT TACAQOQGGG GAGACATA3A TCACAQDC7IG TCIOSIGQ O C 9300 

GAOCXXpQCIG GTnGOGTIG TGCCTACTOC TACTTICIGT AG3GGTAGGC 9350 

ATTTACCIQC TC0CCAACO3 ATCAAQGG33 AGCTAAOCAC TOSAGOOCIT 9400 

AAQQCATnC CiGTl'lTl'lT TnTTTTnT Tl ' l'l ' l ' lTl ' lT ICnTTTnT 9450 

Tncmocr TTOcnciTr Trrrocrnc TrrnoDcrr citiaatqgt 9500 
Q3CIOCA3CT TAOUUL'JJAGT CALUJL'ilAOC 1GIGAAM3GT COGIGAGOCG 9550 
CATCftCIGCA GftGAGIGCIG A23CTQ30CT CICTQCAGftT CATOT 9595 

FIG. 7F 
MS3NPKPQRK TKRNINRRFQ PJKFFQC33QI W3GVYLLPRR GPKLGVRA3R 50 

KASERSQERG RRQPIPKARR PH3RAWAQPS YPWPLYGMEB IGWAGWLISP 100 

FGSRPSWGPT DPPRRSHND3 KVinnJIOGF AIXM3YIPLV GAPH33AAFA 150 

LAH3VRVLED GVNYAIGNLP GCSFSXFLLA U5CLTIPAS AYEVRNVSGI 200 

YHVU^DCSNS SIVyEAAEWI MHi'POlVPCV QEENSSRCWV AIITFI1AARN 250 

ASVPITITRR HVOLIM3TAA PCSAMtfVKSX CGSIFLiVSQL FEFSPRRHET 300 

VQDCNCSIYP OiVSaJFMAW EM-MNWSPTT ALW9QLLRI FQAWEMUAG 350 

AHW3VLAGLA YYSMVGNWAK VLIVALLF^G VDGEIHnGR VAGHITSGSFT 400 

SLFS9GASQK IQLVNIN3SW HINRTALNCN DSLQIGFKAA IJFYAHKFNSS 450 

GCPERMA9CR PIEWFAQGWG PTIYIKPNSS DQRPYCWHYA PRPOSWPAS 500 

CWOGPVYCFT PSPWVGTID R93VPTYSWG ENE3TWMLLN NIREPQC2<WF 550 

GCIVWNSIGF 1KTCGGPPCN IGGVQ^RIU CPTDCFRKHP EAIYIKDG93 600 

PWLTPRCLVD YPYFLWHYPC TLNFSIEKVR MYVGGVEHRL NAACNWIK3B 650 

IOCECRERS ELSPUIglT EWQEUPCAFT TLPALSIGLI KLH3NIVEM2 700 

YL.YGVGSAFV SFAIKWEYIL LL£12J_ADAR VCACXW44LL IAQAEAALEN 750 

LWLNAASVA GAH3ILSFL.V FFCAAWtflKG RLAFGAAYAF YGVWPT.T.T J J, 800 

LALPPRAYAL EREMAASCGG AVLVGLVFLT LSPYYKVFLT RLJVMLQYFI 850 

TRAEAHM3VW VPPI2SIVRG3R DA1XLLTCAV HPFTil hUL'lK T.T.TAILGPIrt 900 

VLQAGTIKVP YFVPAQGLIR ACMLVRKVAG GHXV^IVFMK LGALTOIWY 950 

HAGLRHAVA. VEPWFSAME IKVnWGADT AACGDIIlJGL 1000 

PVSARRGKEI FLGPADSLEG QQWRLLAPTT AY9QQ1RGVL GCHTSCIGR 1050 

EKNQVEGEVQ WSTATQSFL ATCINGVCWT VYH3AGSKTL AGPKSPTIQM 1100 

YINVDLELVG WQAPFGARSM TPC9Q3SSDL YLVTRHAWI PVFRRGDSRG 1150 

SLLSPRPVSY LKGS9QGELL CPSGHWGVF FAAVCIRGVA KAVDFIPVES 1200 

METIMRSPVF TCNSTPPAVP QIFQWAHLHA PIG9GKSIKV PAAYAAQGYK 1250 

VLVLNPSVAA TU3FGAYMSK AH3IDPNIRT GVR3TTIQGS ITbfSTYGKFL 1300 

ADC3QC9QGAY DIIICDBCHS TDtflTlLGIG TVLDQAEIAG AKLWIA3AT 1350 

PPGSVIVPHP NIEEIGLSNN GKEFFYGKAI PIEAIK33RH UTCHSKKKC 1400 

DELAAKUIGL GLNAVAYVRG LEMSVIPPIG Dv/VWAIDAL MIGFK33TDS 1450 

VUCNICVIQ TVEFSLDPTF TJLE1T1VFQP AVSRSQRRGR TGRQR9GIYR 1500 

FVTPGERP9G MFDSSVLCBC YDAQCAWYEL TPAEISVFXR AYLNTPGLPV 1550 

OQCHLEFWES VFIGLTHTDA HFLSQfTKQAG ENFPYLVAYQ ATVCARAQAP 1600 

PPSWD3*JKC LERLKPTLH3 FIPLLYBLGA VgNEVHJIHP ITKYIMACMS 1650 

ADLEWTSIW VLU3GVLAAL AAYCLTIGSV V3V3RIILSG KPAWPEREV 1700 

LYQEFDEMEE CASQLPYIEQ O^LAEQFKQ KALGLLQIAT KQAEAAAPW 1750 

ESKWRALEIF WAKHMWNFIS GIQYLAGLST LPGNPAIASL MAF3ASITSP 1800 

LTTQNILLFN HGGWVAAQL APPSAASAFV GAGIAGAAWG SIGLGKVLVD 1850 

ILAGYGAGVA. GALVAFKVMS GEVPSIEDLV NLLPATT SFG ALVVGVVCAA 1900 
10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

GAVQKMNRLI AEAJSRGNH7S PIHWPESDft. AARVIQILSS 1950 

LTTiyULKRL H^HNELCST PC9SSWLKD/ WDWICIVL1D FKIWLQSKLL 2000 

FRLFGVEFLS 0QRGYKJ3VWR GDGZM3TICP OGRQIAGHVK NGSMRJVKER 2050 

TCSNIWH3IF PINftiTIGPC TTSPAHWSR ALWRUAAEEtf VETIKV^DPH 2100 

WIt34riEN7 KCPO^PAPE FFIEVDGVRL HRYAEACKPL LR&MLULWG 2150 

PCEFEEWIV I/IS4IJII2PSH 1'IAb'JUAKRRL ARGSPPSLAS 2200 

SSASQLSAPS LKA1CTIHHD SPTftDLIEM* LIMRQEM9GN HKVESENKV 2250 

VTLDSFEPLH AB3ZBEESV AAETLRKSRK FPSALPIWftR PEHNPEUBS 2300 

WKDPEKVPPV VH3CELPFEK APPJPEPKRK R3WE/IESNV SSMAEIA3K 2350 

TFGSSGSSAV IASHX2KGS UJESXSSMPP 1 SSEGDBDL 2400 

SDGSWSIV5E EASECWOCS MSYTWTCAIiI TFCAAEESKL RENPLSNSLL 2450 

SRSASLUQKK VitLKLQ^LD UttFEVLKEM KAKASTVKAK 2500 

T.T.STFFACKL TPPHSAKSKF GYGftKEVFNL SSRAVNHIRS VWEEUJEDIE 2550 

TPIDlTJLMftK SEWFCVQPEK GGRKPARLIV FHXG7RVCE KMftLHDWST 2600 

LiPQAVMaSSY GFgttSPKQRV EFLVNIWKSK KCPM3FSXDT RCFDSIVIES 2650 

DIKVEESIYQ OCDLAPEARQ A2BSL1ERLY IQGPUINSKG (J^OGYRRCRA 2700 

SGVLTTSOaJ TUIXTYIKAIA ACRAAKLQCC TMLVMXDLV VICESAGIQE 2750 

EftAALRAFIE AMUQfSAPKS DEPQPEW2LE UT9CSSNVS 2800 

YYHTFDPTTP IARAAWBIAR HTPINSWLjGN IIMMlim PMTLMEHFFS 2850 

ILLAQD3LEK ALDOQIYGAC YSIEPLDLPQ ILERLH3LSA. FTLHSYSPGE 2900 

INKVASCURK LGVPFLKIVJR HRARSVRAKL L9QG3RAATC GTOdTNWWR 2950 

TKLKLTPIPA ASQUXSGWF VftGTSGGDIY HSLSRARPBW FPLCUJLiSV 3000 

GM3IYULPNR 3010 

FIG. 7H 



SUBSTITUTE SHEET (RULE 26) 



WO 00/75338 



PCT/US00/15446 



SEQUENCE LISTING 

<110> Yanagi, Masayuki 
Emerson, Suzanne 
Bukh, Jens 
Purcell, Robert 

<120> Cloned Genome of Infectious Hepatitis C Viruses of 
Genotype 2a and Uses Thereof 

<130> 20264302PC 

<140> TBA 

<141> 2000-06-02 

<150> 60/137,693 
<151> 1999-06-04 

<160> 39 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 9711 
<212> DNA 

<213> Hepatitis C virus 
<400> 1 

acccgcccct aataggggcg acactccgcc atgaatcact cccctgtgag gaactactgt 60 
cttcacgcag aaagcgtcta gccatggcgt tagtatgagt gtcgtacagc ctccaggccc 120 
ccccctcccg ggagagccat agtggtctgc ggaaccggtg agtacaccgg aattgccggg 180 
aagactgggt cctttcttgg ataaacccac tctatgcccg gccatttggg cgtgcccccg 240 
caagactgct agccgagtag cgttgggttg cgaaaggcct tgtggtactg cctgataggg 300 
tgcttgcgag tgccccggga ggtctcgtag accgtgcacc atgagcacaa atcctaaacc 360 
tcaaagaaaa accaaaagaa acaccaaccg tcgcccacaa gacgttaagt ttccgggcgg 420 
cggccagatc gttggcggag tatacttgtt gccgcgcagg ggccccaggt tgggtgtgcg 480 
cgcgacaagg aagacttcgg agcggtccca gccacgtgga aggcgccagc ccatccctaa 540 
agatcggcgc tccactggca aatcctgggg aaaaccagga tacccctggc ccctatacgg 600 
gaatgaggga ctcggctggg caggatggct cctgtccccc cgaggttccc gtccctcttg 660 
gggccccaat gacccccggc ataggtcgcg caacgtgggt aaggtcatcg ataccctaac 720 
gtgcggcttt gccgacctca tggggtacat ccctgtcgtg ggcgccccgc tcggcggcgt 780 
cgccagagct ctcgcgcatg gcgtgagagt cctggaggac ggggttaatt ttgcaacagg 840 
gaacttaccc ggttgctcct tttctatctt cttgctggcc ctgctgtcct gcatcaccac 900 
cccggtctcc gctgccgaag tgaagaacat cagtaccggc tacatggtga ctaacgactg 960 
caccaatgac agcattacct ggcagctcca ggctgctgtc ctccacgtcc ccgggtgcgt 1020 
cccgtgcgag aaagtgggga atgcatctca gtgctggata ccggtctcac cgaatgtggc 1080 
cgtgcagcgg cccggcgccc tcacgcaggg cttgcggacg cacatcgaca tggttgtgat 1140 
gtccgccacg ctctgctctg ccctctacgt gggggacctc tgcggtgggg tgatgctcgc 1200 



agcccaaatg ttcattgtct cgccgcagca ccactggttt gtccaagact gcaattgctc 1260 
catctaccct ggtaccatca ctggacaccg catggcatgg gacatgatga tgaactggtc 1320 
gcccacggct accatgatct tggcgtacgc gatgcgtgtc cccgaggtca ttatagacat 1380 
cattagcggg gctcattggg gcgtcatgtt cggcttggcc tacttctcta tgcagggagc 1440 
gtgggcgaaa gtcgttgtca tccttctgtt ggccgccggg gtggacgcgc gcacccatac 1500 
tgttgggggt tctgccgcgc agaccaccgg gcgcctcacc agcttatttg acatgggccc 1560 
caggcagaaa atccagctcg ttaacaccaa tggcagctgg cacatcaacc gcaccgccct 1620 
gaactgcaat gactccttgc acaccggctt tatcgcgtct ctgttctaca cccacagctt 1680 
caactcgtca ggatgtcccg aacgcatgtc cgcctgccgc agtatcgagg ccttccgggt 1740 
gggatggggc gccttgcaat atgaggataa tgtcaccaat ccagaggata tgagacccta 1800 
ttgctggcac tacccaccaa ggcagtgtgg cgtggtctcc gcgaagactg tgtgtggccc 1860 
agtgtactgt ttcaccccca gcccagtggt agtgggcacg accgacaggc ttggagcgcc 1920 
cacttacacg tggggggaga atgagacaga tgtcttccta ttgaacagca ctcgaccacc 1980 
gctggggtca tggttcggct gcacgtggat gaactcttct ggctacacca agacttgcgg 2040 
cgcaccaccc tgccgtacta gagctgactt caacgccagc acggacctgt tgtgccccac 2100 
ggactgtttt aggaagcatc ctgataccac ttacctcaaa tgcggctctg ggccctggct 2160 
cacgccaagg tgcctgatcg actaccccta caggctctgg cattacccct gcacagttaa 2220 
ctataccatc ttcaaaataa ggatgtatgt gggaggggtt gagcacaggc tcacggctgc 2280 
atgcaatttc actcgtgggg atcgttgcaa cttggaggac agagacagaa gtcaactgtc 2340 
tcctttgttg cactccacca cggaatgggc cattttacct tgctcttact cggacctgcc 2400 
cgccttgtcg actggtcttc tccacctcca ccaaaacatc gtggacgtac aattcatgta 2460 
tggcctatca cctgccctca caaaatacat cgtccgatgg gagtgggtaa tactcttatt 2520 
cctgctctta gcggacgcca gggtttgcgc ctgcttatgg atgctcatct tgttgggcca 2580 
ggccgaagca gcactagaga agctggtcat cttgcacgct gcgagcgcag ctagctgcaa 2640 
tggcttccta tattttgtca tctttttcgt ggctgcttgg tacatcaagg gtcgggtagt 2700 
ccccttagct acctattccc tcactggcct gtggtccttt agcctactgc tcctagcatt 2760 
gccccaacag gcttatgctt atgacgcatc tgtgcatggc cagataggag cggctctgct 2820 
ggtaatgatc actctcttta ctctcacccc cgggtataag acccttctca gccggttttt 2880 
gtggtggttg tgctatcttc tgaccctggg ggaagctatg gtccaggagt gggcaccacc 2940 
tatgcaggtg cgcggtggcc gtgatggcat catatgggcc gtcgccatat tctacccagg 3000 
tgtggtgttt gacataacca agtggctctt ggcggtgctt gggcctgctt acctcctaaa 3060 
aggtgctttg acgcgcgtgc cgtacttcgt cagggctcac gctctactga ggatgtgcac 3120 
catggcaagg catctcgcgg ggggcaggta cgtccagatg gcgctactag cccttggcag 3180 
gtggactggc acttacatct atgaccacct cacccctatg tcggattggg ctgctagtgg 3240 
cctgcgggac ctggcggtcg ccgttgagcc tatcatcttc agtccgatgg agaagaaagt 3300 
cattgtctgg ggagcggaga cagctgcttg tggggacatt ttacacggac ttcccgtgtc 3360 
cgcccgactt ggtcgggagg tcctccttgg cccagctgat ggctatacct ccaaggggtg 3420 
gagtcttctc gcccccatca ctgcttacgc ccagcagaca cgtggccttt tgggcaccat 3480 
agtggtgagc atgacggggc gcgacaagac agaacaggct ggggaaattc aggtcctgtc 3540 
cacagtcact cagtccttcc tcggaacatc catctcgggg gttttgtgga ctgtctacca 3600 
tggagctggc aacaagactc tggccggctc acggggtccg gtcacgcaga tgtactccag 3660 
tgctgagggg gacttagtag ggtggcccag cccccctggg actaaatctt tggagccgtg 3720 
cacgtgtgga gcggtcgacc tgtacctggt cacgcggaac gctgatgtca tcccggctcg 3780 
aagacgcggg gacaaacggg gagcgctact ctccccgaga cctctttcca ccttgaaggg 3840 
gtcctcagga ggcccggtgc tatgccccag gggccacgct gtcggagtct tccgggcagc 3900 
tgtgtgctct cggggcgtgg ctaagtccat agatttcatc cccgttgaga cactcgacat 3960 
cgtcacgcgg tcccccacct ttagtgacaa cagcacacca cctgctgtgc cccagaccta 4020 
tcaggtcggg tacttgcatg ccccgactgg cagtggaaag agcaccaaag ttcctgtcgc 4080 
atatgctgct caggggtata aagtgctagt gcttaatccc tcagtggctg ccaccctggg 4140 
gtttggggcg tacttgtcta aggcacatgg catcaatccc aacattagga ctggagtcag 4200 
gactgtgacg accggggcgc ccatcacgta ctccacatat ggcaaattcc tcgccgatgg 4260 
gggctgtgcg ggcggcgcct acgacatcat catatgtgat gaatgccatg ccgtggactc 4320 
taccaccatc cttggcatcg gaacagtcct tgatcaagca gagacagctg gggtcagact 4380 
aactgtgctg gctacagcta cgccccctgg gtcagtgaca accccccacc ccaacataga 4440 
ggaggtggcc cttgggcagg agggcgagat ccccttctat gggagggcga ttcccctgtc 4500 
ttacatcaag ggaggaagac atctgatctt ctgccattca aagaaaaagt gtgacgagct 4560 
cgcggcggcc cttcggggta tgggcttgaa ctcagtggca tactacagag ggttggacgt 4620 
ctccgtaata ccaactcagg gagacgtagt ggtcgtcgcc accgacgccc tcatgacagg 4680 
gtatactggg gactttgact ccgtgatcga ctgcaacgta gcggtcactc aagttgtaga 4740 
cttcagttta gaccccacat tcaccataac cacacagatt gtccctcaag acgctgtctc 4800 
acgtagccag cgccggggtc gcacgggtag gggaagactg ggcatttata ggtatgtttc 4860 
cactggtgag cgagcctcag gaatgtttga cagtgtagtg ctctgtgagt gctacgacgc 4920 
aggggccgca tggtatgagc tcacaccatc ggagaccacc gtcaggctca gggcgtattt 4980 
caacacgccc ggtttgcctg tgtgccaaga ccatcttgag ttttgggagg cagttttcac 5040 
cggcctcaca cacatagatg cccacttcct ttcccaaaca aagcaatcgg gggaaaattt 5100 
cgcatactta acagcctacc aggctacagt gtgcgctagg gccaaagccc cccccccgtc 5160 
ctgggacgtc atgtggaagt gtttgactcg actcaagccc acactcgtgg gccccacacc 5220 
tctcctgtac cgcttgggct ctgttaccaa cgaggtcacc ctcacacatc ccgtgacgaa 5280 
atacatcgcc acctgcatgc aagccgacct tgaggtcatg accagcacat gggtctcggc 5340 
agggggagtc ttggcggccg tcgccgcgta ttgcctggcg accgggtgtg tttgcatcat 5400 
cggccgcttg cacattaacc agcgagccgt cgttgcgccg gacaaggagg tcctctatga 5460 
ggcttttgat gagatggagg aatgtgcctc tagggcggct ctcattgaag aggggcagcg 5520 
gatagccgag atgctgaagt ccaagatcca aggcttattg cagcaagctt ccaaacaagc 5580 
tcaagacata caacccactg tgcaggcttc atggcccaag gtagaacaat tctgggccaa 5640 
acacatgtgg aacttcatta gcggcatcca atacctcgca ggactatcaa cactgccagg 5700 
gaaccctgca gtagcttcca tgatggcgtt cagtgccgcc ctcaccagtc cgctgtcaac 5760 
aagcaccact atccttctca acattttggg gggctggcta gcatcccaaa ttgcaccacc 5820 
cgcgggggcc actggcttcg ttgtcagtgg cctagtggga gctgccgtag gcagtatagg 5880 
cttaggtaag gtgctagtgg acatcctggc agggtatggt gcgggcattt cgggggctct 5940 
cgtcgcattc aagatcatgt ctggcgagaa gccctccatg gaggatgtcg tcaacttgct 6000 
gcctggaatt ctgtctccgg gtgccttggt agtgggagtc atctgcgcgg ccattctgcg 6060 
ccgacacgtg ggaccggggg aaggcgccgt ccaatggatg aatagactca ttgcctttgc 6120 
ttccagagga aatcacgtcg cccccaccca ctacgtgacg gagtcggatg cgtcgcagcg 6180 
tgtgacccaa ctacttggct cccttaccat aaccagcctg ctcagaagac tccacaactg 6240 
gattactgag gactgcccca tcccatgcgg cggctcgtgg ctccgcgatg tgtgggactg 6300 
ggtttgcacc atcctaacag actttaaaaa ttggctgacc tccaaattat tcccaaagat 6360 
gcccggcctc ccctttgtct cctgtcaaaa ggggtacaag ggcgtgtggg ccggcactgg 6420 
catcatgacc acacggtgtc cttgcggcgc caatatctct ggcaatgtcc gcttgggctc 6480 
catgagaatc acggggccta agacctgcat gaatatctgg caggggacct ttcctatcaa 6540 
ttgttacacg gagggccagt gcgtgccgaa acccgcgcca aactttaagg tcgccatctg 6600 
gagggtggcg gcctcagagt acgcggaggt gacgcagcac gggtcatacc actacataac 6660 
aggactcacc actgataact tgaaagtccc ctgccaacta ccctctcccg agttcttttc 6720 
ctgggtggac ggagtgcaga tccataggtt tgcccccaca ccgaagccgt ttttccggga 6780 
tgaggtctcg ttctgcgttg ggcttaattc atttgtcgtc gggtcccagc ttccttgcga 6840 
ccctgaaccc gacacagacg tattgatgtc catgctaaca gatccatctc atatcacggc 6900 
ggagactgca gcgcggcgtt tagcgcgggg gtcaccccca tccgaggcaa gctcctcggc 6960 
gagccagcta tcggcaccat cgctgcgagc cacctgcacc acccacggca aagcctatga 7020 

tgtggacatg gtggatgcta acctgttcat ggggggcgat gtgactcgga tagagtctgg 7080 

gtccaaagtg gtcgttctgg actctctcga cccaatggtc gaagaaagga gcgaccttga 7140 

gccttcgata ccatcagaat acatgctccc caagaagagg ttcccaccag ctttaccggc 7200 

ctgggcacgg cctgattaca acccaccgct tgtggaatcg tggaaaaggc cagattacca 7260 

accggccact gttgcgggct gtgctctccc tcctcctagg aaaaccccga cgcctccccc 7320 

aaggaggcgc cggacagtgg gcctaagtga ggactccata ggagatgccc ttcaacagct 7380 

ggccattaag tcctttggcc agcccccccc aagcggcgat tcaggccttt ccacgggggc 7440 

gggcgctgcc gattccggca gtcagacgcc tcctgatgag ttggcccttt cggagacagg 7500 

ttccatctct tccatgcccc ccctcgaggg ggagcttgga gatccagacc tggagcctga 7560 

gcaggtagag ccccaacccc ccccccaggg gggggtggca gctcccggct cggactcggg 7620 

gtcctggtct acttgctccg aggaggacga ctccgtcgtg tgctgctcca tgtcatactc 7680 

ctggaccggg gctctaataa ctccttgtag tcccgaagag gagaagttac cgattaaccc 7740 

cttgagcaac tccctgttgc gatatcacaa caaggtgtac tgtaccacaa caaagagcgc 7800 

ctcactaagg gctaaaaagg taacttttga taggatgcaa gtgctcgact cctactacga 7860 

ctcagtctta aaggacatta agctagcggc ctccaaggtc accgcaaggc tcctcaccat 7920 

ggaggaggct tgccagttaa ccccacccca ttctgcaaga tctaaatatg ggtttggggc 7980 

taaggaggtc cgcagcttgt ccgggagggc cgttaaccac atcaagtccg tgtggaagga 8040

cctcctggag gactcagaaa caccaattcc cacaaccatt atggccaaaa atgaggtgtt 8100 

ctgcgtggac cccaccaagg ggggcaagaa agcagctcgc cttatcgttt accctgacct 8160 

cggcgtcagg gtctgcgaga agatggccct ttatgacatt acacaaaaac ttcctcaggc 8220 

ggtgatgggg gcttcttatg gattccagta ttcccccgct cagcgggtag agtttctctt 8280 

gaaagcatgg gcggaaaaga aggaccctat gggtttttcg tatgataccc gatgctttga 8340

ctcaaccgtc actgagagag acatcaggac tgaggagtcc atatatcggg cctgctcctt 8400

gcccgaggag gcccacactg ccatacactc gctaactgag agactttacg tgggagggcc 8460

tatgttcaac agcaagggcc aaacctgcgg gtacaggcgt tgccgcgcca gcggggtgct 8520

caccactagc atggggaaca ccatcacatg ctacgtgaaa gccttagcgg cttgtaaagc 8580

tgcagggata atcgcgccca caatgctggt atgcggcgat gacttggttg tcatctcaga 8640

aagccagggg accgaggagg acgagcggaa cctgagagcc ttcacggagg ctatgaccag 8700 

gtattctgcc cctcctggtg acccccccag accggagtat gatctggagc tgataacatc 8760 

ttgctcctca aatgtgtctg tggcgctggg cccacaaggc cgccgcagat actacctgac 8820 

cagagaccct accactccaa tcgcccgggc tgcctgggaa acagttagac actcccctgt 8880 

caattcatgg ctgggaaaca tcatccagta cgccccgacc atatgggctc gcatggtcct 8940

gatgacacac ttcttctcca ttctcatggc tcaagacacg ctggaccaga acctcaactt 9000 

tgagatgtac ggagcggtgt actccgtgag tcccttggac ctcccagcta taattgaaag 9060 

gttacatggg cttgacgctt tttctctgca cacatacact ccccacgaac tgacacgggt 9120 

ggcttcagcc ctcagaaaac ttggggcgcc acccctcaga gcgtggaaga gccgggcacg 9180 

tgcagtcagg gcgtccctca tctcccgtgg ggggagagcg gccgtttgcg gtcgatatct 9240

cttcaattgg gcggtgaaga ccaagctcaa actcactcca ttgccggaag cgcgcctcct 9300 

ggatttatcc agctggttca ccgtcggcgc cggcgggggc gacatttatc acagcgtgtc 9360 

gcgtgcccga ccccgcttat tgctctttgg cctactccta ctttttgtag gggtaggcct 9420 

tttcctactc cccgctcggt agagcggcac acattagcta cactccatag ctaactgtcc 9480 

cttttttttt tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt 9540 

tttttttttt tttttttttt tttttctttt tttctctttt ccttctttct taccttattt 9600 

tactttcttt cctggtggct ccatcttagc cctagtcacg gctagctgtg aaaggtccgt 9660 

gagccgcatg actgcagaga gtgccgtaac tggtctctct gcagatcatg t 9711 
<210> 2
<211> 3033 
<212> PRT 

<213> Hepatitis C virus 
<400> 2 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220
Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480
Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr
500 505 510

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr
515 520 525

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr
530 535 540

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser
545 550 555 560

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp
565 570 575

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys
580 585 590

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys
645 650 655

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735
Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Lys Leu Val Ile Leu His Ala Ala Ser Ala Ala Ser Cys Asn Gly
755 760 765

Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp Tyr Ile Lys Gly
770 775 780

Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr Gly Leu Trp Ser Phe
785 790 795 800

Ser Leu Leu Leu Leu Ala Leu Pro Gln Gln Ala Tyr Ala Tyr Asp Ala
805 810 815

Ser Val His Gly Gln Ile Gly Ala Ala Leu Leu Val Met Ile Thr Leu
820 825 830

Phe Thr Leu Thr Pro Gly Tyr Lys Thr Leu Leu Ser Arg Phe Leu Trp
835 840 845

Trp Leu Cys Tyr Leu Leu Thr Leu Gly Glu Ala Met Val Gln Glu Trp
850 855 860

Ala Pro Pro Met Gln Val Arg Gly Gly Arg Asp Gly Ile Ile Trp Ala
865 870 875 880

Val Ala Ile Phe Tyr Pro Gly Val Val Phe Asp Ile Thr Lys Trp Leu
885 890 895

Leu Ala Val Leu Gly Pro Ala Tyr Leu Leu Lys Gly Ala Leu Thr Arg
900 905 910

Val Pro Tyr Phe Val Arg Ala His Ala Leu Leu Arg Met Cys Thr Met
915 920 925

Ala Arg His Leu Ala Gly Gly Arg Tyr Val Gln Met Ala Leu Leu Ala
930 935 940

Leu Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Thr Pro Met
945 950 955 960

Ser Asp Trp Ala Ala Ser Gly Leu Arg Asp Leu Ala Val Ala Val Glu
965 970 975

Pro Ile Ile Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp Gly Ala
980 985 990
Glu Thr Ala Ala Cys Gly Asp Ile Leu His Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp Gly Tyr Thr Ser
1010 1015 1020

Lys Gly Trp Ser Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040

Arg Gly Leu Leu Gly Thr Ile Val Val Ser Met Thr Gly Arg Asp Lys
1045 1050 1055

Thr Glu Gln Ala Gly Glu Ile Gln Val Leu Ser Thr Val Thr Gln Ser
1060 1065 1070

Phe Leu Gly Thr Ser Ile Ser Gly Val Leu Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Asn Lys Thr Leu Ala Gly Ser Arg Gly Pro Val Thr Gln Met
1090 1095 1100

Tyr Ser Ser Ala Glu Gly Asp Leu Val Gly Trp Pro Ser Pro Pro Gly
1105 1110 1115 1120

Thr Lys Ser Leu Glu Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu
1125 1130 1135

Val Thr Arg Asn Ala Asp Val Ile Pro Ala Arg Arg Arg Gly Asp Lys
1140 1145 1150

Arg Gly Ala Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Val Leu Cys Pro Arg Gly His Ala Val Gly Val Phe
1170 1175 1180

Arg Ala Ala Val Cys Ser Arg Gly Val Ala Lys Ser Ile Asp Phe Ile
1185 1190 1195 1200

Pro Val Glu Thr Leu Asp Ile Val Thr Arg Ser Pro Thr Phe Ser Asp
1205 1210 1215

Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Tyr Gln Val Gly Tyr Leu
1220 1225 1230

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Val Ala Tyr
1235 1240 1245
Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260

Thr Leu Gly Phe Gly Ala Tyr Leu Ser Lys Ala His Gly Ile Asn Pro
1265 1270 1275 1280

Asn Ile Arg Thr Gly Val Arg Thr Val Thr Thr Gly Ala Pro Ile Thr
1285 1290 1295

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ala Gly Gly
1300 1305 1310

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ala Val Asp Ser Thr
1315 1320 1325

Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340

Val Arg Leu Thr Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
1345 1350 1355 1360

Thr Pro His Pro Asn Ile Glu Glu Val Ala Leu Gly Gln Glu Gly Glu
1365 1370 1375

Ile Pro Phe Tyr Gly Arg Ala Ile Pro Leu Ser Tyr Ile Lys Gly Gly
1380 1385 1390

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Ala Leu Arg Gly Met Gly Leu Asn Ser Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1425 1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455

Asp Cys Asn Val Ala Val Thr Gln Val Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Thr Thr Gln Ile Val Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu Gly Ile Tyr Arg
1490 1495 1500
Tyr Val Ser Thr Gly Glu Arg Ala Ser Gly Met Phe Asp Ser Val Val
1505 1510 1515 1520

Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala Trp Tyr Glu Leu Thr Pro
1525 1530 1535

Ser Glu Thr Thr Val Arg Leu Arg Ala Tyr Phe Asn Thr Pro Gly Leu
1540 1545 1550

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ala Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Ala Tyr Leu Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Lys Ala Pro Pro Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr
1605 1610 1615

Arg Leu Lys Pro Thr Leu Val Gly Pro Thr Pro Leu Leu Tyr Arg Leu
1620 1625 1630

Gly Ser Val Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr
1635 1640 1645

Ile Ala Thr Cys Met Gln Ala Asp Leu Glu Val Met Thr Ser Thr Trp
1650 1655 1660

Val Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala
1665 1670 1675 1680

Thr Gly Cys Val Cys Ile Ile Gly Arg Leu His Ile Asn Gln Arg Ala
1685 1690 1695

Val Val Ala Pro Asp Lys Glu Val Leu Tyr Glu Ala Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ala Ser Arg Ala Ala Leu Ile Glu Glu Gly Gln Arg Ile
1715 1720 1725

Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu Gln Gln Ala Ser
1730 1735 1740

Lys Gln Ala Gln Asp Ile Gln Pro Thr Val Gln Ala Ser Trp Pro Lys
1745 1750 1755 1760






Val Glu Gln Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Val Ala
1780 1785 1790

Ser Met Met Ala Phe Ser Ala Ala Leu Thr Ser Pro Leu Ser Thr Ser
1795 1800 1805

Thr Thr Ile Leu Leu Asn Ile Leu Gly Gly Trp Leu Ala Ser Gln Ile
1810 1815 1820

Ala Pro Pro Ala Gly Ala Thr Gly Phe Val Val Ser Gly Leu Val Gly
1825 1830 1835 1840

Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu
1845 1850 1855

Ala Gly Tyr Gly Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile
1860 1865 1870

Met Ser Gly Glu Lys Pro Ser Met Glu Asp Val Val Asn Leu Leu Pro
1875 1880 1885

Gly Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala
1890 1895 1900

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro Thr
1925 1930 1935

His Tyr Val Thr Glu Ser Asp Ala Ser Gln Arg Val Thr Gln Leu Leu
1940 1945 1950

Gly Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His Asn Trp Ile
1955 1960 1965

Thr Glu Asp Cys Pro Ile Pro Cys Gly Gly Ser Trp Leu Arg Asp Val
1970 1975 1980

Trp Asp Trp Val Cys Thr Ile Leu Thr Asp Phe Lys Asn Trp Leu Thr
1985 1990 1995 2000

Ser Lys Leu Phe Pro Lys Met Pro Gly Leu Pro Phe Val Ser Cys Gln
2005 2010 2015
Lys Gly Tyr Lys Gly Val Trp Ala Gly Thr Gly Ile Met Thr Thr Arg
2020 2025 2030

Cys Pro Cys Gly Ala Asn Ile Ser Gly Asn Val Arg Leu Gly Ser Met
2035 2040 2045

Arg Ile Thr Gly Pro Lys Thr Cys Met Asn Ile Trp Gln Gly Thr Phe
2050 2055 2060

Pro Ile Asn Cys Tyr Thr Glu Gly Gln Cys Val Pro Lys Pro Ala Pro
2065 2070 2075 2080

Asn Phe Lys Val Ala Ile Trp Arg Val Ala Ala Ser Glu Tyr Ala Glu
2085 2090 2095

Val Thr Gln His Gly Ser Tyr His Tyr Ile Thr Gly Leu Thr Thr Asp
2100 2105 2110

Asn Leu Lys Val Pro Cys Gln Leu Pro Ser Pro Glu Phe Phe Ser Trp
2115 2120 2125

Val Asp Gly Val Gln Ile His Arg Phe Ala Pro Thr Pro Lys Pro Phe
2130 2135 2140

Phe Arg Asp Glu Val Ser Phe Cys Val Gly Leu Asn Ser Phe Val Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Asp Val Leu Met
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala Ala Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Glu Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Arg Ala Thr Cys Thr Thr His Gly Lys
2210 2215 2220

Ala Tyr Asp Val Asp Met Val Asp Ala Asn Leu Phe Met Gly Gly Asp
2225 2230 2235 2240

Val Thr Arg Ile Glu Ser Gly Ser Lys Val Val Val Leu Asp Ser Leu
2245 2250 2255

Asp Pro Met Val Glu Glu Arg Ser Asp Leu Glu Pro Ser Ile Pro Ser
2260 2265 2270
Glu Tyr Met Leu Pro Lys Lys Arg Phe Pro Pro Ala Leu Pro Ala Trp
2275 2280 2285

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Ser Trp Lys Arg Pro
2290 2295 2300

Asp Tyr Gln Pro Ala Thr Val Ala Gly Cys Ala Leu Pro Pro Pro Arg
2305 2310 2315 2320

Lys Thr Pro Thr Pro Pro Pro Arg Arg Arg Arg Thr Val Gly Leu Ser
2325 2330 2335

Glu Asp Ser Ile Gly Asp Ala Leu Gln Gln Leu Ala Ile Lys Ser Phe
2340 2345 2350

Gly Gln Pro Pro Pro Ser Gly Asp Ser Gly Leu Ser Thr Gly Ala Gly
2355 2360 2365

Ala Ala Asp Ser Gly Ser Gln Thr Pro Pro Asp Glu Leu Ala Leu Ser
2370 2375 2380

Glu Thr Gly Ser Ile Ser Ser Met Pro Pro Leu Glu Gly Glu Leu Gly
2385 2390 2395 2400

Asp Pro Asp Leu Glu Pro Glu Gln Val Glu Pro Gln Pro Pro Pro Gln
2405 2410 2415

Gly Gly Val Ala Ala Pro Gly Ser Asp Ser Gly Ser Trp Ser Thr Cys
2420 2425 2430

Ser Glu Glu Asp Asp Ser Val Val Cys Cys Ser Met Ser Tyr Ser Trp
2435 2440 2445

Thr Gly Ala Leu Ile Thr Pro Cys Ser Pro Glu Glu Glu Lys Leu Pro
2450 2455 2460

Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg Tyr His Asn Lys Val Tyr
2465 2470 2475 2480

Cys Thr Thr Thr Lys Ser Ala Ser Leu Arg Ala Lys Lys Val Thr Phe
2485 2490 2495

Asp Arg Met Gln Val Leu Asp Ser Tyr Tyr Asp Ser Val Leu Lys Asp
2500 2505 2510

Ile Lys Leu Ala Ala Ser Lys Val Thr Ala Arg Leu Leu Thr Met Glu
2515 2520 2525
Glu Ala Cys Gln Leu Thr Pro Pro His Ser Ala Arg Ser Lys Tyr Gly
2530 2535 2540

Phe Gly Ala Lys Glu Val Arg Ser Leu Ser Gly Arg Ala Val Asn His
2545 2550 2555 2560

Ile Lys Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Glu Thr Pro Ile
2565 2570 2575

Pro Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Asp Pro Thr
2580 2585 2590

Lys Gly Gly Lys Lys Ala Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly
2595 2600 2605

Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu
2610 2615 2620

Pro Gln Ala Val Met Gly Ala Ser Tyr Gly Phe Gln Tyr Ser Pro Ala
2625 2630 2635 2640

Gln Arg Val Glu Phe Leu Leu Lys Ala Trp Ala Glu Lys Lys Asp Pro
2645 2650 2655



Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
2660 2665 2670

Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Arg Ala Cys Ser Leu Pro
2675 2680 2685

Glu Glu Ala His Thr Ala Ile His Ser Leu Thr Glu Arg Leu Tyr Val
2690 2695 2700

Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Thr Cys Gly Tyr Arg Arg
2705 2710 2715 2720

Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Met Gly Asn Thr Ile Thr
2725 2730 2735

Cys Tyr Val Lys Ala Leu Ala Ala Cys Lys Ala Ala Gly Ile Ile Ala
2740 2745 2750

Pro Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Ser Glu Ser
2755 2760 2765

Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu Arg Ala Phe Thr Glu Ala
2770 2775 2780






Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Arg Pro Glu Tyr
2785 2790 2795 2800

Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu
2805 2810 2815

Gly Pro Gln Gly Arg Arg Arg Tyr Tyr Leu Thr Arg Asp Pro Thr Thr
2820 2825 2830

Pro Ile Ala Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn
2835 2840 2845

Ser Trp Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Ala Arg
2850 2855 2860

Met Val Leu Met Thr His Phe Phe Ser Ile Leu Met Ala Gln Asp Thr
2865 2870 2875 2880

Leu Asp Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser Val
2885 2890 2895

Ser Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly Leu Asp
2900 2905 2910

Ala Phe Ser Leu His Thr Tyr Thr Pro His Glu Leu Thr Arg Val Ala
2915 2920 2925

Ser Ala Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg Ala Trp Lys Ser
2930 2935 2940

Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ser Arg Gly Gly Arg Ala
2945 2950 2955 2960

Ala Val Cys Gly Arg Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu
2965 2970 2975

Lys Leu Thr Pro Leu Pro Glu Ala Arg Leu Leu Asp Leu Ser Ser Trp
2980 2985 2990

Phe Thr Val Gly Ala Gly Gly Gly Asp Ile Tyr His Ser Val Ser Arg
2995 3000 3005

Ala Arg Pro Arg Leu Leu Leu Phe Gly Leu Leu Leu Leu Phe Val Gly
3010 3015 3020

Val Gly Leu Phe Leu Leu Pro Ala Arg
3025 3030
<210> 3
<211> 9611 
<212> DNA 

<213> Hepatitis C virus 



<400> 3 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200
cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580
aggccgaagc agcactagag aagctggtca tcttgcacgc tgcgagcgca gctagctgca 2640
atggcttcct atattttgtc atctttttcg tggctgcttg gtacatcaag ggtcgggtag 2700
tccccttagc tacctattcc ctcactggcc tgtggtcctt tagcctactg ctcctagcat 2760
tgccccaaca ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820
ttgtcgggtt aatggcgctg actctgtcgc catattacaa gcgctatatc agctggtgca 2880
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360
ctgcccgtag gggccaggag atactgcttg ggccagccga cggaatggtc tccaaggggt 3420
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720
gtacctgcgg ctcctcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900
cggtgtgcac ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga 3960
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140
gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200
gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260
gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320
ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380
tggttgtgct cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440
aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctzcg 4500
aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560
tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620
tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680
gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740
atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800
ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860
caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920
cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980
tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040
cgggcctcac tcatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100
ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160
cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220
ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280
aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460
aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacccg gtcaatctgc 6000
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggc gggggacttc cactacgtat 6660
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840
agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200
gggccccgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560
ggccatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100
ccgacctggg cgtgcgcgtg tgcgagaaga cggccctgta cgacgtggtt agcaagctcc 8160
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280

gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340

gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400

ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460 

gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520 

gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580 

tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640

tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700 

taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760 

accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820

ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880 

tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940 

ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000 

ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060 

atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120 

gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180 

agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240

gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300

gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360

taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420

catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480 

ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540 

gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600 

gcagatcatg t 9611 

<210> 4 
<211> 3015 
<212> PRT 

<213> Hepatitis C virus 
<400> 4 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr
500 505 510

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr
515 520 525

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr
530 535 540

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser
545 550 555 560

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp
565 570 575

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys
580 585 590

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys
645 650 655

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Lys Leu Val Ile Leu His Ala Ala Ser Ala Ala Ser Cys Asn Gly
755 760 765

Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp Tyr Ile Lys Gly
770 775 780

Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr Gly Leu Trp Ser Phe
785 790 795 800

Ser Leu Leu Leu Leu Ala Leu Pro Gln Gln Ala Tyr Ala Leu Asp Thr
805 810 815

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala
820 825 830

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp
835 840 845

Trp Leu Gln Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp
850 855 860

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu
865 870 875 880

Met Cys Val Val His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu
885 890 895

Leu Ala Ile Phe Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys
900 905 910

Val Pro Tyr Phe Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu
915 920 925

Ala Arg Lys Ile Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys
930 935 940

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu
945 950 955 960

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu
965 970 975

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala
980 985 990

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser
1010 1015 1020

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys
1045 1050 1055

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr
1060 1065 1070

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met
1090 1095 1100

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
1105 1110 1115 1120

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu
1125 1130 1135

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
1140 1145 1150

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe
1170 1175 1180

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp
1205 1210 1215

Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu
1220 1225 1230

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr
1235 1240 1245

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro
1265 1270 1275 1280

Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr
1285 1290 1295

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly
1300 1305 1310

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr
1315 1320 1325

Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
1345 1350 1355 1360

Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu



25 



WO 00/75338 



PCT/US00/15446 



1365 1370 1375 

Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly
1380 1385 1390

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser
1425 1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg
1490 1495 1500

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val
1505 1510 1515 1520

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro
1525 1530 1535

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu
1540 1545 1550

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile
1605 1610 1615

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu
1620 1625 1630

Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr
1635 1640 1645

Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp
1650 1655 1660

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser
1665 1670 1675 1680

Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro
1685 1690 1695

Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu
1715 1720 1725

Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser
1730 1735 1740

Arg His Ala Glu Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys
1745 1750 1755 1760

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala
1780 1785 1790

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly
1795 1800 1805

Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu
1810 1815 1820

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly
1825 1830 1835 1840

Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu
1845 1850 1855

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile
1860 1865 1870

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro
1875 1880 1885



Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala
1890 1895 1900

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
1925 1930 1935

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
1940 1945 1950

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
1955 1960 1965

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
1970 1975 1980

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln
2005 2010 2015

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met
2035 2040 2045

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe
2050 2055 2060

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro
2065 2070 2075 2080

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu
2085 2090 2095

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp
2100 2105 2110

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu
2115 2120 2125

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu
2130 2135 2140

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp
2210 2215 2220

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu
2225 2230 2235 2240

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile
2245 2250 2255

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val
2260 2265 2270

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285 

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr 
2290 2295 2300 

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu 
2305 2310 2315 2320 

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr 
2325 2330 2335 

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala 
2340 2345 2350 

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn
2355 2360 2365 

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser 
2370 2375 2380 

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2385 2390 2395 2400

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala 
2405 2410 2415 

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly 
2420 2425 2430 

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn
2435 2440 2445 

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr 
2450 2455 2460 

Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg
2465 2470 2475 2480

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
2485 2490 2495 

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 
2500 2505 2510 

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly 
2515 2520 2525 

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr
2545 2550 2555 2560

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly
2565 2570 2575

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg
2580 2585 2590 

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu 
2595 2600 2605 

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly
2625 2630 2635 2640

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp
2645 2650 2655

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln
2660 2665 2670

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly
2675 2680 2685 

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg 
2690 2695 2700 

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr 
2705 2710 2715 2720 

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr
2725 2730 2735

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly
2740 2745 2750

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr
2755 2760 2765

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu
2770 2775 2780

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800 

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu 
2805 2810 2815 

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp 
2820 2825 2830 

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile
2835 2840 2845

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu
2850 2855 2860

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro
2865 2870 2875 2880

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe
2885 2890 2895
2900 2905 2910

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala
2915 2920 2925

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile
2930 2935 2940

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu
2945 2950 2955 2960

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr
2965 2970 2975

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg
2980 2985 2990

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly
2995 3000 3005

Ile Tyr Leu Leu Pro Asn Arg
3010 3015 



<210> 5 
<211> 9611 
<212> DNA 

<213> Hepatitis C virus 
<400> 5 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccgg 180
gaagactggg tcctttcttg gataaaccca ctctatgccc ggccatttgg gcgtgccccc 240
gcaagactgc tagccgagta gcgttgggtt gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200
cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580
aggccgaagc agctttggag aacctcgtaa tactcaatgc agcatccctg gccgggacgc 2640
acggtcttgt gtccttcctc gtgttcttct gctttgcgtg gtatctgaag ggtaggtggg 2700
tgcccggagc ggtctacgcc ctctacggga tgtggcctct cctcctgctc ctgctggcgt 2760
tgcctcagcg ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820
ttgtcgggtt aatggcgctg actctgtcgc catattacaa gcgctatatc agctggtgca 2880
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360
ctgcccgtag gggccaggag atactgcttg ggccagccga cggaatggtc tccaaggggt 3420
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720
gtacctgcgg ctcctcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900
cggtgtgcac ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga 3960
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140
gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200
gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260
gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320
ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380
tggttgtgct cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440
aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg 4500
aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560
tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620
tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680
gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740
atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800
ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860
caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920
cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980
tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040
cgggcctcac fccatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100
ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160
cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220
ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280
aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460
aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840
agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200
gggccctgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340
gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940
ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420
catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600
gcagatcatg t 9611

<210> 6
<211> 3015
<212> PRT 

<213> Hepatitis C virus 
<400> 6 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr 
500 505 510 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr 
515 520 525 

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr 
530 535 540 

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser 
545 550 555 560 

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp 
565 570 575 

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys 
580 585 590 

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr 
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640 

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys 
645 650 655 

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Asn Leu Val Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly
755 760 765 

Leu Val Ser Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly 
770 775 780 

Arg Trp Val Pro Gly Ala Val Tyr Ala Leu Tyr Gly Met Trp Pro Leu 
785 790 795 800 

Leu Leu Leu Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr
805 810 815 

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala 
820 825 830 

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp
835 840 845

Trp Leu Gln Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp
850 855 860

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu
865 870 875 880

Met Cys Val Val His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu
885 890 895

Leu Ala Ile Phe Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys
900 905 910

Val Pro Tyr Phe Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu
915 920 925

Ala Arg Lys Ile Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys
930 935 940 

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu 
945 950 955 960 

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu 
965 970 975 

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala
980 985 990

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser
1010 1015 1020

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys
1045 1050 1055

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr
1060 1065 1070

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met
1090 1095 1100

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
1105 1110 1115 1120 

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu 
1125 1130 1135 

Val Thr Arg His Ala Asp Val He Pro Val Arg Arg Arg Gly Asp Ser 
1140 1145 1150 

Arg Gly Ser Leu Leu Ser Pro Arg Pro He Ser Tyr Leu Lys Gly Ser 
1155 1160 1165 

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe 
1170 1175 1180 

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200 

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp 
1205 1210 1215 

Asn Ser Ser Pro Pro Ala Val Pro Gin Ser Phe Gin Val Ala His Leu 
1220 1225 1230 

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr 
1235 1240 1245 
Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260 

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro 
1265 1270 1275 1280 

Asn He Arg Thr Gly Val Arg Thr He Thr Thr Gly Ser Pro He Thr 
1285 1290 1295 

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly 
1300 1305 1310 

Ala Tyr Asp He He He Cys Asp Glu Cys His Ser Thr Asp Ala Thr 
1315 1320 1325 

Ser He Leu Gly He Gly Thr Val Leu Asp Gin Ala Glu Thr Ala Gly 
1330 1335 1340 

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr 
1345 1350 1355 1360 

Val Ser His Pro Asn He Glu Glu Val Ala Leu Ser Thr Thr Gly Glu 
1365 1370 1375 

He Pro Phe Tyr Gly Lys Ala He Pro Leu Glu Val lie Lys Gly Gly 
1380 1385 1390 

Arg His Leu He Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala 
1395 1400 1405 

Ala Lys Leu Val Ala Leu Gly He Asn Ala Val Ala Tyr Tyr Arg Gly 
1410 1415 1420 

Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val Val Val Val Ser 
1425 1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val He 
1445 1450 1455 

Asp Cys Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser Leu Asp Pro 
1460 1465 1470 

Thr Phe Thr He Glu Thr Thr Thr Leu Pro Gin Asp Ala Val Ser Arg 
1475 1480 1485 


Thr Gin Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly He Tyr Arg 
1490 1495 1500 
Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val
1505 1510 1515 1520 

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro 
1525 1530 1535 

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu 
1540 1545 1550 

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
1555 1560 1565 

Leu Thr His lie Asp Ala His Phe Leu Ser Gin Thr Lys Gin Ser Gly 
1570 1575 1580 

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val Cys Ala Arg 
1585 1590 1595 1600 

Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin Met Trp Lys Cys Leu lie 
1605 1610 1615 

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu 
1620 1625 1630 

Gly Ala Val Gin Asn Glu Val Thr Leu Thr His Pro lie Thr Lys Tyr 
1635 1640 1645 

lie Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp 
1650 1655 1660 

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser 
1665 1670 1675 1680 

Thr Gly Cys Val Val lie Val Gly Arg He Val Leu Ser Gly Lys Pro 
1685 1690 1695 

Ala He He Pro Asp Arg Glu Val Leu Tyr Gin Glu Phe Asp Glu Met 
1700 1705 1710 

Glu Glu Cys Ser Gin His Leu Pro Tyr He Glu Gin Gly Met Met Leu 
1715 1720 1725 

Ala Glu Gin Phe Lys Gin Lys Ala Leu Gly Leu Leu Gin Thr Ala Ser 
1730 1735 1740 

Arg His Ala Glu Val He Thr Pro Ala Val Gin Thr Asn Trp Gin Lys 
1745 1750 1755 1760 
Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775 

Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala lie Ala 
1780 1785 1790

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly 
1795 1800 1805 

Gin Thr Leu Leu Phe Asn lie Leu Gly Gly Trp Val Ala Ala Gin Leu 
1810 1815 1820 

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly 
1825 1830 1835 1840

Ala Ala lie Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp lie Leu 
1845 1850 1855 

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys lie 
1860 1865 1870 

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro 
1875 1880 1885 

Ala lie Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala 
1890 1895 1900 

He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gin Trp Met 
1905 1910 1915 1920 

Asn Arg Leu He Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr 
1925 1930 1935 

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala He Leu 
1940 1945 1950 

Ser Ser Leu Thr Val Thr Gin Leu Leu Arg Arg Leu His Gin Trp He 
1955 1960 1965 

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp He 
1970 1975 1980 

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gin Leu Pro Gly He Pro Phe Val Ser Cys Gin 
2005 2010 2015 
Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030 

Cys His Cys Gly Ala Glu lie Thr Gly His Val Lys Asn Gly Thr Met 
2035 2040 2045 

Arg lie Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe 
2050 2055 2060 

Pro lie Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro 
2065 2070 2075 2080 

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu 
2085 2090 2095 

lie Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp 
2100 2105 2110 

Asn Leu Lys Cys Pro Cys Gin lie Pro Ser Pro Glu Phe Phe Thr Glu 
2115 2120 2125 

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu 
2130 2135 2140 

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val 
2145 2150 2155 2160 

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175 

Ser Met Leu Thr Asp Pro Ser His lie Thr Ala Glu Ala Ala Gly Arg 
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser 
2195 2200 2205 

Gin Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp 
2210 2215 2220 

Ser Pro Asp Ala Glu Leu lie Glu Ala Asn Leu Leu Trp Arg Gin Glu 
2225 2230 2235 2240 

Met Gly Gly Asn He Thr Arg Val Glu Ser Glu Asn Lys Val Val He 
2245 2250 2255 

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val 
2260 2265 2270 
Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285 

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr 
2290 2295 2300 

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu 
2305 2310 2315 2320 

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr
2325 2330 2335 

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala 
2340 2345 2350 

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly He Thr Gly Asp Asn 
2355 2360 2365 

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser 
2370 2375 2380 

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2385 2390 2395 2400 

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala 
2405 2410 2415 

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly
2420 2425 2430 

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gin Lys Leu Pro He Asn 
2435 2440 2445 

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr
2450 2455 2460 

Thr Ser Arg Ser Ala Cys Gin Arg Gin Lys Lys Val Thr Phe Asp Arg 
2465 2470 2475 2480 

Leu Gin Val Leu Asp Ser His Tyr Gin Asp Val Leu Lys Glu Val Lys 
2485 2490 2495 

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 
2500 2505 2510

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly 
2515 2520 2525 
Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro He Asp Thr 
2545 2550 2555 2560 

Thr lie Met Ala Lys Asn Glu Val Phe Cys Val Gin Pro Glu Lys Gly 
2565 2570 2575 

Gly Arg Lys Pro Ala Arg Leu He Val Phe Pro Asp Leu Gly Val Arg 
2580 2585 2590 

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu 
2595 2600 2605 

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620 

Val Glu Phe Leu Val Gin Ala Trp Lys Ser Lys Lys Thr Pro Met Gly 
2625 2630 2635 2640 

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp
2645 2650 2655 

He Arg Thr Glu Glu Ala He Tyr Gin Cys Cys Asp Leu Asp Pro Gin 
2660 2665 2670 

Ala Arg Val Ala He Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly 
2675 2680 2685 

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg
2690 2695 2700 

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr 
2705 2710 2715 2720 

He Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gin Asp Cys Thr 
2725 2730 2735 

Met Leu Val Cys Gly Asp Asp Leu Val Val He Cys Glu Ser Ala Gly 
2740 2745 2750 

Val Gin Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr 
2755 2760 2765 

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gin Pro Glu Tyr Asp Leu 
2770 2775 2780 
Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800 

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu 
2805 2810 2815 

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp 
2820 2825 2830 

Leu Gly Asn lie lie Met Phe Ala Pro Thr Leu Trp Ala Arg Met He 
2835 2840 2845 

Leu Met Thr His Phe Phe Ser Val Leu He Ala Arg Asp Gin Leu Glu 
2850 2855 2860 

Gin Ala Leu Asn Cys Glu He Tyr Gly Ala Cys Tyr Ser He Glu Pro 
2865 2870 2875 2880 

Leu Asp Leu Pro Pro He He Gin Arg Leu His Gly Leu Ser Ala Phe 
2885 2890 2895 

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys
2900 2905 2910 

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala 
2915 2920 2925 

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala He 
2930 2935 2940 

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu 
2945 2950 2955 2960 

Thr Pro He Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr 
2965 2970 2975 

Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Val Ser His Ala Arg 
2980 2985 2990 

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly
2995 3000 3005 

He Tyr Leu Leu Pro Asn Arg 
3010 3015 

<210> 7
<211> 9611
<212> DNA 

<213> Hepatitis C virus 



<400> 7 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200
cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580
aggccgaagc agcactagag aagctggtca tcttgcacgc tgcgagcgca gctagctgca 2640
atggcttcct atattttgtc atctttttcg tggctgcttg gtacatcaag ggtcgggtag 2700
tccccttagc tacctattcc ctcactggcc tgtggtcctt tagcctactg ctcctagcat 2760
tgccccaaca ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820
ttgtcgggtt aatggcgctg actctgtcgc catattacaa gcgctatatc agctggtgca 2880
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360
ctgcccgtag gggccaggag atactgcttg ggccagccga cggaatggtc tccaaggggt 3420
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720
gtacctgcgg ctcctcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900
cggtgtgcac ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga 3960
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140
gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200
gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260
gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320
ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380
tggttgtgct cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440
aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg 4500
aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560
tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620
tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680
gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740
atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800
ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860
caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920
cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980
tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040
cgggcctcac tcatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100
ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160
cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220
ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280
aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460
aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840
agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200
gggccctgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340






gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520 
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580 
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640 
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700 
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760 
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820 
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940
ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000 
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060 
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120 
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180 
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240 
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300 
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360 
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420 
catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480 
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540 
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600 
gcagatcatg t 9611 

<210> 8 
<211> 3015 
<212> PRT 

<213> Hepatitis C virus 
<400> 8 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
35 40
Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly
55 60
Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp
70 75
Lys Pro Gly
80
Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly
85 90
Ala Gly Trp
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110 

Arg His Arg Ser Arg Asn Val Gly Lys Val lie Asp Thr Leu Thr Cys 
115 120 125 

Gly Phe Ala Asp Leu Met Gly Tyr lie Pro Val Val Gly Ala Pro Leu 
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser lie 
165 170 175 

Phe Leu Leu Ala Leu Leu Ser Cys He Thr Thr Pro Val Ser Ala Ala 
180 185 190 

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205 

Asn Asp Ser He Thr Trp Gin Leu Gin Ala Ala Val Leu His Val Pro 
210 215 220 

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240 

Pro Val Ser Pro Asn Val Ala Val Gin Arg Pro Gly Ala Leu Thr Gin 
245 250 255 

Gly Leu Arg Thr His He Asp Met Val Val Met Ser Ala Thr Leu Cys 
260 265 270 

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285 

Gin Met Phe He Val Ser Pro Gin His His Trp Phe Val Gin Asp Cys 
290 295 300 



Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365 

Ala Lys Val Val Val lie Leu Leu Leu Ala Ala Gly Val Asp Ala Arg 
370 375 380 

Thr His Thr Val Gly Gly Ser Ala Ala Gin Thr Thr Gly Arg Leu Thr 
385 390 395 400 

Ser Leu Phe Asp Met Gly Pro Arg Gin Lys lie Gin Leu Val Asn Thr 
405 410 415 

Asn Gly Ser Trp His lie Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser 
420 425 430 

Leu His Thr Gly Phe lie Ala Ser Leu Phe Tyr Thr His Ser Phe Asn 
435 440 445 

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser lie Glu Ala 
450 455 460 

Phe Arg Val Gly Trp Gly Ala Leu Gin Tyr Glu Asp Asn Val Thr Asn 
465 470 475 480 

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gin Cys 
485 490 495 

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr 
500 505 510 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr 
515 520 525 

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr 
530 535 540 

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser 
545 550 555 560 

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp
565 570 575 

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys 
580 585 590

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr 
595 600 605 
Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620 

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640 

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys 
645 650 655 

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670 

Thr Thr Glu Trp Ala lie Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala 
675 680 685 

Leu Ser Thr Gly Leu Leu His Leu His Gin Asn He Val Asp Val Gin 
690 695 700 

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr He Val Arg Trp 
705 710 715 720 

Glu Trp Val He Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys 
725 730 735 

Ala Cys Leu Trp Met Leu He Leu Leu Gly Gin Ala Glu Ala Ala Leu 
740 745 750 

Glu Lys Leu Val He Leu His Ala Ala Ser Ala Ala Ser Cys Asn Gly 
755 760 765 

Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp Tyr Ile Lys Gly
770 775 780 

Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr Gly Leu Trp Ser Phe
785 790 795 800 

Ser Leu Leu Leu Leu Ala Leu Pro Gin Gin Ala Tyr Ala Leu Asp Thr 
805 810 815 

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala 
820 825 830 

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr He Ser Trp Cys Met Trp 
835 840 845 

Trp Leu Gin Tyr Phe Leu Thr Arg Val Glu Ala Gin Leu His Val Trp 
850 855 860 
Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu
865 870 875 880

Met Cys Val Val His Pro Thr Leu Val Phe Asp He Thr Lys Leu Leu 
885 890 895 

Leu Ala He Phe Gly Pro Leu Trp He Leu Gin Ala Ser Leu Leu Lys 
900 905 910 



Val Pro Tyr Phe Val Arg Val Gin Gly Leu Leu Arg He Cys Ala Leu 
915 920 925 

Ala Arg Lys He Ala Gly Gly His Tyr Val Gin Met Ala He He Lys 
930 935 940 

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu 
945 950 955 960 



Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu 
965 970 975 

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu He Thr Trp Gly Ala 
980 985 990 

Asp Thr Ala Ala Cys Gly Asp He He Asn Gly Leu Pro Val Ser Ala 
995 1000 1005 

Arg Arg Gly Gin Glu He Leu Leu Gly Pro Ala Asp Gly Met Val Ser 
1010 1015 1020 

Lys Gly Trp Arg Leu Leu Ala Pro He Thr Ala Tyr Ala Gin Gin Thr 
1025 1030 1035 1040 

Arg Gly Leu Leu Gly Cys He He Thr Ser Leu Thr Gly Arg Asp Lys 
1045 1050 1055 

Asn Gin Val Glu Gly Glu Val Gin He Val Ser Thr Ala Thr Gin Thr 
1060 1065 1070 

Phe Leu Ala Thr Cys He Asn Gly Val Cys Trp Thr Val Tyr His Gly 
1075 1080 1085 



Ala Gly Thr Arg Thr He Ala Ser Pro Lys Gly Pro Val He Gin Met 
1090 1095 1100 

Tyr Thr Asn Val Asp Gin Asp Leu Val Gly Trp Pro Ala Pro Gin Gly 
1105 1110 1115 1120 
Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu
1125 1130 1135 

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
1140 1145 1150 

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe 
1170 1175 1180 

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200 

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp 
1205 1210 1215 

Asn Ser Ser Pro Pro Ala Val Pro Gin Ser Phe Gin Val Ala His Leu 
1220 1225 1230 

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr 
1235 1240 1245 

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260 

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro
1265 1270 1275 1280 

Asn lie Arg Thr Gly Val Arg Thr He Thr Thr Gly Ser Pro He Thr 
1285 1290 1295 

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly 
1300 1305 1310 

Ala Tyr Asp He He He Cys Asp Glu Cys His Ser Thr Asp Ala Thr 
1315 1320 1325 

Ser He Leu Gly He Gly Thr Val Leu Asp Gin Ala Glu Thr Ala Gly 
1330 1335 1340 

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr 
1345 1350 1355 1360 

Val Ser His Pro Asn He Glu Glu Val Ala Leu Ser Thr Thr Gly Glu 
1365 1370 1375 
Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly
1380 1385 1390 

Arg His Leu lie Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala 
1395 1400 1405 

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
1410 1415 1420 

Leu Asp Val Ser Val He Pro Thr Ser Gly Asp Val Val Val Val Ser 
1425 1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455 

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
1460 1465 1470 

Thr Phe Thr He Glu Thr Thr Thr Leu Pro Gin Asp Ala Val Ser Arg 
1475 1480 1485 

Thr Gin Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly He Tyr Arg 
1490 1495 1500 

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val
1505 1510 1515 1520 

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro 
1525 1530 1535 

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu 
1540 1545 1550 

Pro Val Cys Gin Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly 
1555 1560 1565 

Leu Thr His He Asp Ala His Phe Leu Ser Gin Thr Lys Gin Ser Gly 
1570 1575 1580 

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val Cys Ala Arg 
1585 1590 1595 1600 

Ala Gin Ala Pro Pro Pro Ser Trp Asp Gin Met Trp Lys Cys Leu He 
1605 1610 1615 

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu 
1620 1625 1630 
Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr
1635 1640 1645 

lie Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp 
1650 1655 1660 

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser
1665 1670 1675 1680 

Thr Gly Cys Val Val lie Val Gly Arg lie Val Leu Ser Gly Lys Pro 
1685 1690 1695 

Ala lie lie Pro Asp Arg Glu Val Leu Tyr Gin Glu Phe Asp Glu Met 
1700 1705 1710 

Glu Glu Cys Ser Gin His Leu Pro Tyr lie Glu Gin Gly Met Met Leu 
1715 1720 1725 

Ala Glu Gin Phe Lys Gin Lys Ala Leu Gly Leu Leu Gin Thr Ala Ser 
1730 1735 1740 

Arg His Ala Glu Val lie Thr Pro Ala Val Gin Thr Asn Trp Gin Lys 
1745 1750 1755 1760 

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775 

Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala lie Ala 
1780 1785 1790 

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly 
1795 1800 1805 

Gin Thr Leu Leu Phe Asn lie Leu Gly Gly Trp Val Ala Ala Gin Leu 
1810 1815 1820 

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly 
1825 1830 1835 1840 

Ala Ala lie Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp lie Leu 
1845 1850 1855 

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys lie 
1860 1865 1870 

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro 
1875 1880 1885 
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Ala He Leu Ser Pro Gly Ala Leu val val Gly Val val Cys Ala Ala 
1890 1895 1900 

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
1925 1930 1935

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
1940 1945 1950

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
1955 1960 1965

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
1970 1975 1980

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln
2005 2010 2015

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met
2035 2040 2045

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe
2050 2055 2060

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro
2065 2070 2075 2080

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu
2085 2090 2095

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp
2100 2105 2110

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu
2115 2120 2125

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu
2130 2135 2140

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp
2210 2215 2220

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu
2225 2230 2235 2240

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile
2245 2250 2255

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val
2260 2265 2270

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr
2290 2295 2300

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu
2305 2310 2315 2320

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr
2325 2330 2335

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala
2340 2345 2350

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn
2355 2360 2365

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser
2370 2375 2380

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly
2385 2390 2395 2400

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala
2405 2410 2415

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly 
2420 2425 2430 

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn
2435 2440 2445

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr
2450 2455 2460

Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg
2465 2470 2475 2480

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
2485 2490 2495

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala
2500 2505 2510

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly
2515 2520 2525

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr
2545 2550 2555 2560

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly
2565 2570 2575

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg
2580 2585 2590

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu
2595 2600 2605

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly
2625 2630 2635 2640

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp
2645 2650 2655

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln
2660 2665 2670

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly
2675 2680 2685

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg
2690 2695 2700

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr
2705 2710 2715 2720

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr
2725 2730 2735

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly
2740 2745 2750

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr
2755 2760 2765

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu
2770 2775 2780

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu
2805 2810 2815

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp
2820 2825 2830

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile
2835 2840 2845

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu
2850 2855 2860

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro
2865 2870 2875 2880

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe
2885 2890 2895

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys
2900 2905 2910

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala
2915 2920 2925

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile
2930 2935 2940

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu
2945 2950 2955 2960

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr
2965 2970 2975

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg
2980 2985 2990

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly
2995 3000 3005

Ile Tyr Leu Leu Pro Asn Arg
3010 3015



<210> 9 
<211> 9611 
<212> DNA 

<213> Hepatitis C virus 
<400> 9 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200
cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580
aggccgaagc agctttggag aacctcgtaa tactcaatgc agcatccctg gccgggacgc 2640
acggtcttgt gtccttcctc gtgttcttct gctttgcgtg gtatctgaag ggtaggtggg 2700
tgcccggagc ggtctacgcc ctctacggga tgtggcctct cctcctgctc ctgctggcgt 2760
tgcctcagcg ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820
ttgtcgggtt aatggcgctg actctgtcgc catattacaa gcgctatatc agctggtgca 2880
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360
ctgcccgtag gggccaggag atactgcttg ggccagccga cggaatggtc tccaaggggt 3420
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720
gtacctgcgg ctcctcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900
cggtgtgcac ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga 3960
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140
gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200
gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260
gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320
ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380
tggttgtgct cgctactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440
aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg 4500
aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560
tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620
tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680
gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740
atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800
ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860
caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920
cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980
tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040
cgggcctcac tcatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100
ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160
cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220
ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280
aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460
aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840
agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200
gggccctgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340
gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940
ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420
catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600
gcagatcatg t 9611



<210> 10
<211> 3015
<212> PRT 

<213> Hepatitis C virus 
<400> 10 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr 
500 505 510 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr 
515 520 525 

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr
530 535 540

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser
545 550 555 560

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp
565 570 575

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys
580 585 590

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys
645 650 655

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Asn Leu Val Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly
755 760 765

Leu Val Ser Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly
770 775 780

Arg Trp Val Pro Gly Ala Val Tyr Ala Leu Tyr Gly Met Trp Pro Leu
785 790 795 800

Leu Leu Leu Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr
805 810 815

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala
820 825 830

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp
835 840 845

Trp Leu Gln Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp
850 855 860

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu
865 870 875 880

Met Cys Val Val His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu
885 890 895

Leu Ala Ile Phe Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys
900 905 910

Val Pro Tyr Phe Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu
915 920 925

Ala Arg Lys Ile Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys
930 935 940

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu
945 950 955 960

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu
965 970 975

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala
980 985 990

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser
1010 1015 1020

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys
1045 1050 1055

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr
1060 1065 1070

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met
1090 1095 1100

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
1105 1110 1115 1120

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu
1125 1130 1135

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
1140 1145 1150

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe
1170 1175 1180

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp
1205 1210 1215

Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu
1220 1225 1230

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr
1235 1240 1245

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro
1265 1270 1275 1280

Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr
1285 1290 1295

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly
1300 1305 1310

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr
1315 1320 1325

Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
1345 1350 1355 1360

Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu
1365 1370 1375

Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly
1380 1385 1390

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser
1425 1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg
1490 1495 1500

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val
1505 1510 1515 1520

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro 
1525 1530 1535 

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu 
1540 1545 1550 

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile
1605 1610 1615

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu
1620 1625 1630

Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr
1635 1640 1645

Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp
1650 1655 1660

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser
1665 1670 1675 1680

Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro
1685 1690 1695

Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu
1715 1720 1725

Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser
1730 1735 1740

Arg His Ala Glu Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys
1745 1750 1755 1760

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala
1780 1785 1790

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly
1795 1800 1805

Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu
1810 1815 1820

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly
1825 1830 1835 1840

Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu
1845 1850 1855

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile
1860 1865 1870

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro
1875 1880 1885

Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala
1890 1895 1900

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
1925 1930 1935

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
1940 1945 1950

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
1955 1960 1965

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
1970 1975 1980

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln
2005 2010 2015

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met
2035 2040 2045

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe
2050 2055 2060

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro
2065 2070 2075 2080

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu
2085 2090 2095

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp
2100 2105 2110

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu
2115 2120 2125

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu
2130 2135 2140

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp
2210 2215 2220

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu
2225 2230 2235 2240

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile
2245 2250 2255

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val
2260 2265 2270

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr
2290 2295 2300 

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu 
2305 2310 2315 2320 

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr 
2325 2330 2335 

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala 
2340 2345 2350 

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn
2355 2360 2365 

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser 
2370 2375 2380 

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2385 2390 2395 2400 

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala 
2405 2410 2415 

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly 
2420 2425 2430 

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gin Lys Leu Pro lie Asn 
2435 2440 2445 

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr 
2450 2455 2460 

Thr Ser Arg Ser Ala Cys Gin Arg Gin Lys Lys Val Thr Phe Asp Arg 
2465 2470 2475 2480 

Leu Gin Val Leu Asp Ser His Tyr Gin Asp Val Leu Lye Glu Val Lys 
2485 2490 2495 

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 
2500 2505 2510 

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly 
2515 2520 2525 

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr
2545 2550 2555 2560 

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly
2565 2570 2575 

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg
2580 2585 2590 

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu 
2595 2600 2605 

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620 

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly
2625 2630 2635 2640 

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp 
2645 2650 2655 

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln
2660 2665 2670

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly
2675 2680 2685 

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg 
2690 2695 2700 

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr 
2705 2710 2715 2720 

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr
2725 2730 2735 

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly
2740 2745 2750

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr
2755 2760 2765 

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu
2770 2775 2780 

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu 
2805 2810 2815 

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp 
2820 2825 2830 

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile
2835 2840 2845 

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu
2850 2855 2860 

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro
2865 2870 2875 2880 

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe
2885 2890 2895 

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys
2900 2905 2910 

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala 
2915 2920 2925

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile
2930 2935 2940 

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu 
2945 2950 2955 2960 

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr
2965 2970 2975 

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg
2980 2985 2990 

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly 
2995 3000 3005 

Ile Tyr Leu Leu Pro Asn Arg
3010 3015 



<210> 11 
<211> 24 
<212> DNA 
<213> Hepatitis C virus
<400> 11 

actggacacg gaggtggccg cgtc 24 



<210> 12 
<211> 24 
<212> DNA 

<213> Hepatitis C virus 
<400> 12 

ttgttcttgt cgggttaatg gcgc 24 



<210> 13 
<211> 24 
<212> DNA 

<213> Hepatitis C virus 
<400> 13 

gggtgtacta cacacatgag taag 24 



<210> 14 
<211> 22 
<212> DNA 

<213> Hepatitis C virus 
<400> 14 

aagcgcccct aacttgatga tg 22 



<210> 15 
<211> 40 
<212> DNA 

<213> Hepatitis C virus 
<400> 15 

cgtcatcgat acctcagcgg gcatatgcac tggacacgga 40 



<210> 16 
<211> 24
<212> DNA 

<213> Hepatitis C virus 
<400> 16 
24



<210> 17 
<211> 32 
<212> DNA 

<213> Hepatitis C virus 
<400> 17 

catgcaccag ctgatatagc gcttgtaata tg 32 



<210> 18 
<211> 30 
<212> DNA

<213> Hepatitis C virus 
<400> 18 

tccgtagagg aagcttgcag cctgacgccc 30 



<210> 19 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 19 

cagaggaggc agggctgcta tatgtggcaa gtac 34 



<210> 20 
<211> 34 
<212> DNA 

<213> Hepatitis C virus

<400> 20 

gtacttgcca catatagcag ccctgcctcc tctg 34 



<210> 21 
<211> 43 
<212> DNA 

<213> Hepatitis C virus
<400> 21 

cgtctctaga caggaaatgg cttaagaggc cggagtgttt acc 43 
<210> 22
<211> 65 
<212> DNA 

<213> Hepatitis C virus 
<400> 22

ttatggatgc tcatcttgtt gggccaggcc gaagcagctt tggagaacct cgtaatactc 60 
aatgc 65 



<210> 23 
<211> 32 
<212> DNA 

<213> Hepatitis C virus 
<400> 23 

aggatttgtg ctcatggtgc acggtctacg ag 32 



<210> 24 
<211> 50 
<212> DNA 

<213> Hepatitis C virus 
<400> 24 

ttttttttgc ggccgctaat acgactcact atagacccgc ccctaatagg 50 



<210> 25 
<211> 31 
<212> DNA 

<213> Hepatitis C virus 
<400> 25 

ccgtgcacca tgagcacaaa tcctaaacct c 31 



<210> 26 
<211> 26 
<212> DNA 

<213> Hepatitis C virus 
<400> 26 

ggatgtaccc catgaggtcg gcaaag 26 



<210> 27
<211> 30
<212> DNA

<213> Hepatitis C virus
<400> 27

gtttgcgcct gcttatggat gctcatcttg 30



<210> 28 
<211> 26 
<212> DNA 

<213> Hepatitis C virus 
<400> 28 

gcgtcataag catatgcctg ttgggg 26 

<210> 29
<211> 23
<212> DNA

<213> Hepatitis C virus
<400> 29

ccctcagcac tggagtacat ctg 23



<210> 30
<211> 39
<212> DNA

<213> Hepatitis C virus
<400> 30

cgtcatgcat acccctaggg cggctctcat tgaagaggg 39



<210> 31
<211> 30
<212> DNA

<213> Hepatitis C virus
<400> 31

cgtcccctct tcaatgagag ccgctctaga 30



<210> 32
<211> 28
<212> DNA

<213> Hepatitis C virus
<400> 32

gcggtgaaga ccaagctcaa actcactc 28



<210> 33 
<211> 41 
<212> DNA 

<213> Hepatitis C virus 



<400> 33 

aatctagaag gcgcgcttcc ggcaatggag tgagtttgag 



<210> 34 
<211> 38 
<212> DNA 

<213> Hepatitis C virus 
<400> 34 

cgtctctaga ggataaatcc aggaggcgcg cttccggc 38



<210> 35 
<211> 27 
<212> DNA 

<213> Hepatitis C virus 
<400> 35 

tactttttgt aggggtaggc cttttcc 27



<210> 36 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 36 

cgtctctaga gtgtagctaa tgtgtgccgc tcta 34



<210> 37 

<211> 24 
<212> DNA 

<213> Hepatitis C virus 
<400> 37 

ctatggagtg tagctaatgt gtgc 24
<210> 38
<211> 66 
<212> DNA 

<213> Hepatitis C virus 
<400> 38 

cgtctctaga catgatctgc agagagacca gttacggcac tctctgcagt catgcggctc 60 
acggac 66 



<210> 39 
<211> 41 
<212> DNA 

<213> Hepatitis C virus 
<400> 39 

ctttcacagc tagccgtgac tagggctaag atggagccac c 41 